G-API: Support CoreML Execution Providers for ONNXRT Backend #24068
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
- Use the same tools and plugins for SDK build and AAR build
- Added script to test Gradle-based samples against local maven repo
- Various local fixes and debug prints
This patch changes LSX to a baseline feature and LASX to a dispatch
feature. Additionally, the runtime detection methods for LASX and
LSX have been modified.
* add Winograd FP16 implementation
* fixed dispatching of FP16 code paths in dnn; use dynamic dispatcher only when NEON_FP16 is enabled in the build and the feature is present in the host CPU at runtime
* fixed some warnings
* hopefully fixed winograd on x64 (and maybe other platforms)
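The dispatching rule from the second bullet can be sketched as below. This is only an illustration of the "enabled in the build AND present on the host CPU" condition, not the code from the patch: `Path`, `Kernel`, `runFp32`, `runFp16` and `selectKernel` are hypothetical names, and the two flags are passed in as plain booleans instead of being read from the build configuration and the CPU.

```cpp
// Which code path a kernel belongs to (hypothetical, for illustration only).
enum class Path { FP32, FP16 };

using Kernel = Path (*)();

// Stand-ins for real kernels; each one only reports which path it represents.
inline Path runFp32() { return Path::FP32; }
inline Path runFp16() { return Path::FP16; }

// Select the FP16 kernel only when it was compiled in AND the host CPU
// reports the feature at runtime; otherwise fall back to the FP32 kernel.
inline Kernel selectKernel(bool builtWithFp16, bool cpuHasFp16)
{
    return (builtWithFp16 && cpuHasFp16) ? runFp16 : runFp32;
}
```

With this shape, a build without FP16 support never reaches the FP16 kernel, regardless of what the CPU reports.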
---------
Co-authored-by: Vadim Pisarevsky <vadim.pisarevsky@gmail.com>
Updated Android samples for modern Android studio. Added OpenCV from Maven support. #24473
Updated samples for recent Android studio:
- added namespace field that is required in build.gradle files
- replaced _switch_ by _if-else_ because it doesn't work with constants from resources
- added missed log library dependency in face-detection/jni/CMakeLists.txt
- use local.properties to define NDK location
Added support for OpenCV from Maven. You can now choose one of 3 possible sources of the OpenCV library in settings.gradle: SDK path, local Maven repository, or public Maven repository. (Creating a Maven repository from the SDK was added in #24456.)
There are differences in project configs for SDK and Maven versions:
- different dependencies in build.gradle
- different OpenCV library names in CMakeLists.txt
- SDK version requires OpenCV_DIR definition
Requires:
- https://github.com/opencv/ci-gha-workflow/pull/124
- https://github.com/opencv-infrastructure/opencv-gha-dockerfile/pull/26
G-API: Advanced device selection for ONNX DirectML Execution Provider #24060
### Overview
Extend `cv::gapi::onnx::ep::DirectML` to accept an adapter name as a constructor parameter, so that the execution device can be selected by name.
E.g.:
```
pp.cfgAddExecutionProvider(cv::gapi::onnx::ep::DirectML("Intel Graphics"));
```
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
* Optimize some functions with lasx.
Optimize some functions with lasx. #23929
This patch optimizes some lasx functions and reduces the runtime of opencv_test_core from 662,238 ms to 633,603 ms on the 3A5000 platform.
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Enable multicore CUDA compilation #24382
CUDA source files are compiled single threaded. The option `--threads` was introduced in NVCC 11.2. The option specifies the number of threads to be used for compilation (see [NVIDIA NVCC Documentation](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#threads-number-t)).
With CMake 3.12 the environment variable `CMAKE_BUILD_PARALLEL_LEVEL` was introduced (see [CMake Documentation](https://cmake.org/cmake/help/latest/envvar/CMAKE_BUILD_PARALLEL_LEVEL.html)). This variable is used to set the NVCC `--threads` option.
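The decision described above can be restated as a small function. The actual change lives in the CMake scripts; this C++ sketch only mirrors the logic (use `--threads` when the variable is set and NVCC is at least 11.2), and the function name `nvccThreadsFlag` and its parameters are hypothetical.

```cpp
#include <string>

// Returns the extra NVCC argument, or an empty string when it does not apply.
// `parallelLevel` is the raw value of CMAKE_BUILD_PARALLEL_LEVEL
// (nullptr when the environment variable is not set).
std::string nvccThreadsFlag(const char* parallelLevel, int nvccMajor, int nvccMinor)
{
    // The --threads option was introduced in NVCC 11.2.
    const bool supported = nvccMajor > 11 || (nvccMajor == 11 && nvccMinor >= 2);
    if (!supported || parallelLevel == nullptr || *parallelLevel == '\0')
        return std::string();
    return "--threads " + std::string(parallelLevel);
}
```

A caller would typically pass `std::getenv("CMAKE_BUILD_PARALLEL_LEVEL")` as the first argument.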
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
Supporting protobuf v22 and later (with abseil-cpp/C++17) #24372
fix https://github.com/opencv/opencv/issues/24369
related https://github.com/opencv/opencv/issues/23791
1. This patch supports external protobuf v22 and later, which requires abseil-cpp and C++17.
Even if the built-in protobuf is upgraded to v22 or later,
the dependency on abseil-cpp and the requirement for C++17 will remain.
2. Some Caffe tests require a patched protobuf, so this patch disables them.
This patch was tested with the following libraries.
- Protobuf: /usr/local/lib/libprotobuf.so (4.24.4)
- abseil-cpp: YES (20230125)
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
* added more or less cross-platform (based on POSIX signal() semantics) method to detect various NEON extensions, such as FP16 SIMD arithmetics, BF16 SIMD arithmetics, SIMD dotprod etc. It could be propagated to other instruction sets if necessary.
* hopefully fixed compile errors
* continue to fix CI
* another attempt to fix build on Linux aarch64
* reverted to the original method to detect special arm neon instructions without signal()
* renamed FP16_SIMD & BF16_SIMD to NEON_FP16 and NEON_BF16, respectively
* removed extra whitespaces
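For context, the signal-based detection mentioned in the first bullet (and later reverted) generally follows the pattern below: run a probe, and treat a SIGILL as "instruction not available". This is a generic POSIX sketch, not the code that was in the patch. The names `runsWithoutSigill`, `onSigill` and `g_probeEnv` are hypothetical, and a real probe would execute the instruction under test.

```cpp
#include <setjmp.h>
#include <signal.h>

// Jump target used to leave the signal handler (not thread-safe: sketch only).
static sigjmp_buf g_probeEnv;

// SIGILL handler: abandon the probe and return to the sigsetjmp() call site.
static void onSigill(int) { siglongjmp(g_probeEnv, 1); }

// Runs `probe`; returns false if it raised SIGILL, true otherwise.
bool runsWithoutSigill(void (*probe)())
{
    struct sigaction sa = {};
    struct sigaction old = {};
    sa.sa_handler = onSigill;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGILL, &sa, &old);      // install the temporary handler

    bool ok = false;
    if (sigsetjmp(g_probeEnv, 1) == 0) // 1: save and later restore the signal mask
    {
        probe();                       // may trigger SIGILL
        ok = true;                     // reached only if the probe returned normally
    }

    sigaction(SIGILL, &old, nullptr);  // restore the previous handler
    return ok;
}
```

The global jump buffer and the process-wide handler are part of why such probing is fragile in library code.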
dnn: cleanup of halide backend for 5.x #24231
Merge with https://github.com/opencv/opencv_extra/pull/1092.
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
dnn: cleanup of tengine backend #24122
Cleanup for OpenCV 5.0. The Tengine backend was added to speed up the convolution layer on ARM CPUs, but it is not maintained, and the convolution layer on our default backend has reached performance similar to that of Tengine.
Tengine backend related PRs:
- https://github.com/opencv/opencv/pull/16724
- https://github.com/opencv/opencv/pull/18323
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Update opencv dnn to support cann version >=6.3 #23936
1. Modify the search path of "libopsproto.so" in OpenCVFindCANN.cmake
2. Add the search path of "libgraph_base.so" in OpenCVFindCANN.cmake
3. Automatically check the Ascend socVersion; tested successfully on Ascend310/Ascend310B/Ascend910B
G-API: Support DirectML Execution Provider for ONNXRT Backend #24045
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [ ] I agree to contribute to the project under Apache 2 License.
- [ ] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
Add AVIF support through libavif. #23596
This is to fix https://github.com/opencv/opencv/issues/19271
Extra: https://github.com/opencv/opencv_extra/pull/1069
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake
Python typing stub generation #20370
Add stub generation to `gen2.py`, addressing #14590.
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
- [x] The PR is proposed to proper branch
- [x] There is reference to original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
Android: don't require deprecated tools #21736
Checking for these deprecated tools is no longer necessary, and is in fact broken on fresh Android SDK installs. Remove the check.
resolves #21735
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
AGP 8.0 build.gradle namespace and aidl buildFeature requirement added #23447
Hello,
Android Gradle Plugin version 8.0 requires a namespace. This has become mandatory; after I updated my AGP to 8.0, I got this error:
```
Namespace not specified. Please specify a namespace in the module's build.gradle file like so:
android {
namespace 'com.example.namespace'
}
If the package attribute is specified in the source AndroidManifest.xml, it can be migrated automatically to the namespace value in the build.gradle file using the AGP Upgrade Assistant; please refer to https://developer.android.com/studio/build/agp-upgrade-assistant for more information.
```
This change fixes the error for future releases. However, I am not sure which namespace OpenCV wants to use; I used "org.opencv". If a different namespace is preferred, please let me know so I can change it. Also, should I add the namespace to "opencv/modules/java/android_sdk/android_gradle_lib/build.gradle" as well?
### Sources
Android developer link: https://developer.android.com/studio/preview/features#namespace-dsl
Issue Tracker Google: https://issuetracker.google.com/issues/191813691?pli=1#comment19
### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
* fix openmp include and link issue on macos
* turn off have_openmp if OpenMP_CXX_INCLUDE_DIRS is empty
* test commit
* use condition HAVE_OPENMP and OpenMP_CXX_LIBRARIES for linking
* remove trailing whitespace
* remove notes
* update conditions
* use OpenMP_CXX_LIBRARIES for linking
Use reinterpret instead of c-style casting for GCC
Co-authored-by: Xu Zhang <xu.zhang@hexintek.com>
Co-authored-by: Maksim Shabunin <maksim.shabunin@gmail.com>
* cann backend impl v1
* cann backend impl v2: use opencv parsers to build models for cann
* adjust fc according to the new transA and transB
* put cann net in cann backend node and reuse forwardLayer
* use fork() to create a child process and compile cann model
* remove legacy code
* remove debug code
* fall back to CPU backend if there is one layer not supported by CANN backend
* fix netInput forward
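The fallback rule from the list above (use the CPU backend as soon as one layer is not supported by CANN) can be sketched as follows. The types `Backend` and `LayerInfo` and the function `chooseBackend` are hypothetical and exist only for this illustration; they are not part of the OpenCV dnn API.

```cpp
#include <vector>

// Hypothetical stand-ins for the real backend identifiers and layer objects.
enum class Backend { CPU, CANN };

struct LayerInfo
{
    bool supportedByCann;  // whether this layer has a CANN implementation
};

// The whole network runs on CANN only if every layer supports it;
// a single unsupported layer sends the whole network to the CPU backend.
Backend chooseBackend(const std::vector<LayerInfo>& layers)
{
    for (const LayerInfo& layer : layers)
    {
        if (!layer.supportedByCann)
            return Backend::CPU;
    }
    return Backend::CANN;
}
```

In this sketch the decision is made for the whole network at once rather than per layer, matching the all-or-nothing wording of the commit message.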
Adds the option to enable delay loading of CUDA DLLs on Windows. This is particularly useful to use the same binary on systems with and without CUDA support without distributing the CUDA DLLs to systems that cannot use them at all due to missing CUDA-supported hardware.
Resolves #13509
* cmake: Fix DirectX detection in mingw
The pragma comment directive is valid for MSVC only. So, the DirectX detection
fails in mingw. The failure is fixed by adding the required linking library
(here d3d11) in the try_compile() function in OpenCVDetectDirectX.cmake file.
Also add a message if the first DirectX check fails.
* gapi: Fix compilation with mingw
These changes remove MSVC specific pragma directive. The compilation fails at
linking time due to absence of proper linking library. The required libraries
are added in corresponding CMakeLists.txt file.
* samples: Fix compilation with mingw
These changes remove MSVC specific pragma directive. The compilation fails at
linking time due to absence of proper linking library. The required libraries
are added in corresponding CMakeLists.txt file.
Support downloading 3rdparty resources from Gitcode & Gitlab-style mirrors
* replace github.com with gitcode.net for ocv_download
* replace raw.githubusercontent.com with gitcode.net for ocv_download
* rename functions and remove some comments
* add options for custom mirrors, which simply replace domain github.com & githubusercontent.com
* run ocv_init_download once; replace DL_URL with mirrored one when calling ocv_download
* fix for empty download links when not using mirror
* fix bugs: set(.. .. PARENT_SCOPE) for ocv_init_download; correct macro names for replace github archives and raw githubusercontent
* adjusted mirror swapping impl: replace with mirrored link before each ocv_download; update md5sum for archives
* fix a bug: macro invoked with incorrect arguments by non-set vars
* enclose if statement
* workable impl
* shorten the var names of two key options
* scalable implementation of downloading from mirror and using custom mirror
* improve ocv_init_download help message
* fix the different extracted directory name in case of ADE & TBB which are downloaded from release page
* improve help message printing
* Download ADE & TBB using commit ids instead of from release pages
* support custom mirrors on downloading archives
* improve hints
* add missing parentheses
* reset ocv_download calls
* mirror support implementation using ocv_cmake_hook & ocv_cmake_hook_append
* move ocv_init_download into cmake/OpenCVDownload.cmake
* move ocv_cmake_hook before checking CMake cache
* improve hints when not fetching as git repo
* add WORKING_DIRECTORY in execute_process in ocv_init_download
* use OPENCV_DOWNLOAD_MIRROR_ID
* add custom.cmake for custom mirror
* detect github origin
* fix broken var name
* download from github by default if custom tbb is set
* add checksum checks for gitcode.cmake before replacing urls and checksums
* add checksum checks for custom.cmake before replacing urls and checksums
* use description specify instead of set for messages in custom.cmake; use warning message for warnings
* updates and fixes
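The core idea of the custom mirror option, replacing the github.com and raw.githubusercontent.com domains in a download link, can be sketched as below. The real implementation is written in CMake and also handles checksums, which this sketch leaves out. The function name `mirrorUrl` is hypothetical, and so is any mirror host passed to it.

```cpp
#include <string>

// Rewrites a GitHub download link so that it points at `mirrorHost`.
// Links that do not start with one of the known GitHub hosts, and all links
// when no mirror is configured (empty `mirrorHost`), are returned unchanged.
std::string mirrorUrl(const std::string& url, const std::string& mirrorHost)
{
    if (mirrorHost.empty())
        return url;  // no mirror configured: keep the original link

    static const char* const kGithubPrefixes[] = {
        "https://raw.githubusercontent.com/",
        "https://github.com/",
    };
    for (const char* candidate : kGithubPrefixes)
    {
        const std::string prefix(candidate);
        if (url.compare(0, prefix.size(), prefix) == 0)
            return "https://" + mirrorHost + "/" + url.substr(prefix.size());
    }
    return url;  // not a GitHub link: leave it alone
}
```

Only the domain is swapped here; the path after the host is kept as-is, which assumes the mirror uses the same repository layout.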
* Added NEON support in builds for Windows on ARM
* Fixed `HAVE_CPU_NEON_SUPPORT` display broken during compiler test
* Fixed a build error prior to Visual Studio 2022
* Fix compile against lapack-3.10.0
Fix compilation against lapack >= 3.9.1 and 3.10.0 while not breaking older versions
OpenCVFindLAPACK.cmake & CMakeLists.txt: determine OPENCV_USE_LAPACK_PREFIX from LAPACK_VERSION
hal_internal.cpp : Only apply LAPACK_FUNC to functions whose number of inputs depends on LAPACK_FORTRAN_STR_LEN in lapack >= 3.9.1
lapack_check.cpp : remove LAPACK_FUNC, which is not applicable because the functions are not called with input parameters (so lapack.h preprocessing of "LAPACK_xxxx(...)" does not apply with lapack >= 3.9.1).
If it is not removed, lapack_check fails and LAPACK is deactivated in the build (not what we want)
use OCV_ prefix and don't use Global, instead generate OCV_LAPACK_FUNC depending on CMake Conditions
Remove CONFIG from find_package(LAPACK) and use LAPACK_GLOBAL and LAPACK_NAME to figure out if using netlib's reference LAPACK implementation and how to #define OCV_LAPACK_FUNC(f)
* Fix typos and grammar in comments
- QGLWidget changed to QOpenGLWidget in window_QT.h for Qt6 using
typedef OpenCVQtWidgetBase for handling Qt version
- Implement Qt6/OpenGL functionality in window_QT.cpp
- Swap QGLWidget:: function calls for OpenCVQtWidgetBase:: function calls
- QGLWidget::updateGL deprecated, swap to QOpenGLWidget::update for Qt6
- Add preprocessor definition to detect Qt6 -- HAVE_QT6
- Add OpenGLWidgets to qdeps list in highgui CMakeLists.txt
- find_package CMake command added for locating Qt module OpenGLWidgets
- Added check that Qt6::OpenGLWidgets component is found. Shut off Qt-openGL functionality if not found.
[GSoC] OpenCV.js: Accelerate OpenCV.js DNN via WebNN
* Add WebNN backend for OpenCV DNN Module
Update dnn.cpp
Update dnn.cpp
Update dnn.cpp
Update dnn.cpp
Add WebNN header files into OpenCV 3rd party files
Create webnn.hpp
update cmake
Complete README and add OpenCVDetectWebNN.cmake file
add webnn.cpp
Modify webnn.cpp
Can successfully compile the codes for creating a MLContext
Update webnn.cpp
Update README.md
Update README.md
Update README.md
Update README.md
Update cmake files and
update README.md
Update OpenCVDetectWebNN.cmake and README.md
Update OpenCVDetectWebNN.cmake
Fix OpenCVDetectWebNN.cmake and update README.md
Add source webnn_cpp.cpp and library libwebnn_proc.so
Update dnn.cpp
Update dnn.cpp
Update dnn.cpp
Update dnn.cpp
update dnn.cpp
update op_webnn
update op_webnn
Update op_webnn.hpp
update op_webnn.cpp & hpp
Update op_webnn.hpp
Update op_webnn
update the skeleton
Update op_webnn.cpp
Update op_webnn
Update op_webnn.cpp
Update op_webnn.cpp
Update op_webnn.hpp
update op_webnn
update op_webnn
Solved the problems of released variables.
Fixed the bugs in op_webnn.cpp
Implement op_webnn
Implement Relu by WebNN API
Update dnn.cpp for better test
Update elementwise_layers.cpp
Implement ReLU6
Update elementwise_layers.cpp
Implement SoftMax using WebNN API
Implement Reshape by WebNN API
Implement PermuteLayer by WebNN API
Implement PoolingLayer using WebNN API
Update pooling_layer.cpp
Update pooling_layer.cpp
Update pooling_layer.cpp
Update pooling_layer.cpp
Update pooling_layer.cpp
Update pooling_layer.cpp
Implement poolingLayer by WebNN API and add more detailed logs
Update dnn.cpp
Update dnn.cpp
Remove redundant codes and add more logs for poolingLayer
Add more logs in the pooling layer implementation
Fix the indent issue and resolve the compiling issue
Fix the build problems
Fix the build issue
Fix the build issue
Update dnn.cpp
Update dnn.cpp
* Fix the build issue
* Implement BatchNorm Layer by WebNN API
* Update convolution_layer.cpp
This is a temporary file for Conv2d layer implementation
* Integrate some general functions into op_webnn.cpp&hpp
* Update const_layer.cpp
* Update convolution_layer.cpp
Still have some bugs that should be fixed.
* Update conv2d layer and fc layer
still have some problems to be fixed.
* update constLayer, conv layer, fc layer
There are still some bugs to be fixed.
* Fix the build issue
* Update concat_layer.cpp
Still have some bugs to be fixed.
* Update conv2d layer, fully connected layer and const layer
* Update convolution_layer.cpp
* Add OpenCV.js DNN module WebNN Backend (both using webnn-polyfill and electron)
* Delete bib19450.aux
* Update dnn.cpp
* Fix Error in dnn.cpp
* Resolve duplication in conditions in convolution_layer.cpp
* Fixed the issues in the comments
* Fix building issue
* Update tutorial
* Fixed comments
* Address the comments
* Update CMakeLists.txt
* Offer more accurate perf test on native
* Add better perf tests for both native and web
* Modify perf tests for better results
* Use a more recent version of Electron
* Support latest WebNN Clamp op
* Add definition of HAVE_WEBNN macro
* Support group convolution
* Implement Scale_layer using WebNN
* Add Softmax option for native classification example
* Fix comments
* Fix comments