Updated GSoC_2022 (markdown)

Vadim Pisarevsky 2022-02-17 13:49:45 +03:00
parent 31cf5f8ab6
commit 2b80fc1acd

@ -82,9 +82,16 @@ Mailing list to discuss: [opencv-gsoc-2022 mailing list](https://groups.google.c
| Index | to | Ideas | Below |
| ------------------------ | ------------------------- | -------------- | ----------------- |
| [Audiovisual Speech Recognition](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-audiovisual-speech-recognition-demo) | [FP16 & BF16 for DNN](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-FP16-Compute-Paths-in-OpenCV-DNN) | [Improved imgcodecs](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-improved-imgcodecs) | [ Ficus bindings](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-ficus-bindings) |
| [Point Cloud Compression](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-point-cloud-compression) | [Demo for Android](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-demo-for-android) | [ONNX/numpy operators](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-implement-onnx-numpy-operations-in-core) | [Image augmentation for DNN](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-image-augmentation-to-assist-dl-training) |
| [Multi-task CV models](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-multitask-cv-models) | [Simple Triangle Rendering](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-simple-triangle-rendering) |
| [Audiovisual Speech Recognition](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-audiovisual-speech-recognition-demo)
| [FP16 & BF16 for DNN](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-FP16-Compute-Paths-in-OpenCV-DNN)
| [Improved imgcodecs](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-improved-imgcodecs)
| [ Ficus bindings](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-ficus-bindings) |
| [Point Cloud Compression](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-point-cloud-compression)
| [Simple Triangle Rendering](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-simple-triangle-rendering)
| [Demo for Android](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-demo-for-android)
| [ONNX/numpy operators](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-implement-onnx-numpy-operations-in-core)
| [Image augmentation for DNN](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-image-augmentation-to-assist-dl-training)
| [Multi-task CV models](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-multitask-cv-models)
| [Idea Template](https://github.com/opencv/opencv/wiki/GSoC_2022#idea-template)| [ ]() | [ ]() |
All work is in C++ unless otherwise noted.
@ -98,7 +105,6 @@ All work is in C++ unless otherwise noted.
* Demo that reads video + audio using OpenCV API, analyzes speech and prints the recognized text in realtime in console or separate UI window.
* ***Resources:***
* [Audiovisual Speech Recognition & Lipreading](https://paperswithcode.com/task/lipreading)
* ***Skills Required:*** training and using deep learning; basic to good knowledge of NLP; good C++/Python coding skills.
* ***Possible Mentors:*** Batanina Liubov
* ***Difficulty:*** Medium
@ -117,7 +123,7 @@ All work is in C++ unless otherwise noted.
* ***Difficulty:*** Hard
1. #### _IDEA:_ Improved imgcodecs
* ***Description:*** It's not a single big project but rather a series of tasks under the "image codecs" umbrella. The goal is to make a series of improvements in the existing opencv_imgcodecs module, such as (but not limited to, please, propose new ideas):
* ***Description:*** It's not a single big project but rather a series of tasks under the "image codecs" umbrella. The goal is to make a series of improvements in the existing opencv_imgcodecs module, including, but not limited to, the following items (please, feel free to propose your ideas):
* replace libpng with much more compact and equally efficient [libspng](https://github.com/randy408/libspng/)
* probably, insert some bits of [pngcrush](https://pmt.sourceforge.io/pngcrush/) into png encoder to achieve better compression
* enable inline assembly in libjpeg-turbo in order to boost performance.
@ -133,15 +139,33 @@ All work is in C++ unless otherwise noted.
* ***Difficulty:*** Easy to Medium
1. #### _IDEA:_ Ficus Bindings
* ***Description:*** Extend Ficus OpenCV bindings to cover more modules. Preferably create a script to generate bindings automatically. Add more examples to use OpenCV from Ficus.
* ***Description:*** Extend the OpenCV bindings for the Ficus programming language to cover more modules. Preferably create a script to generate bindings automatically. Add more examples of using OpenCV from Ficus.
* ***Expected Outcomes:***
* Patches for ficus OpenCV bindings to bring in more functionality
* Several OpenCV C++/Python examples converted to Ficus
* Documentation (probably, a chapter in the Ficus tutorial)
* ***Resources:***
* [Ficus at Github](https://github.com/vpisarev/ficus); inside the doc directory you may find the tutorial.
* [Current Ficus OpenCV bindings](https://github.com/vpisarev/ficus/blob/master/lib/OpenCV.fx)
* [Object detection example in Ficus](https://github.com/vpisarev/ficus/blob/master/examples/objdetect.fx) that contains build instructions
* ***Skills Required:*** Mastery experience coding in C/C++, basic knowledge of functional languages, such as Ocaml, SML, Haskell, Rust.
* ***Possible Mentors:*** Vadim Pisarevsky
* ***Difficulty:*** Medium (without automatic bindings generator) to Hard (with the automatic generator)
1. #### _IDEA:_ Point Cloud Compression
* ***Description:*** Implement algorithm(s) to compress point clouds and optionally compress meshes. In OpenCV 5.x there is a new [3D module](https://github.com/opencv/opencv/tree/5.x/modules/3d) that contains algorithms for 3D data processing. A point cloud is one of the fundamental representations of 3D data, and there is a growing number of point cloud processing functions, including I/O, RANSAC-based registration, plane fitting, etc. Point cloud compression is another desired functionality to add, where we throw away some points from the cloud while preserving the overall geometry that the point cloud implicitly describes.
* ***Expected Outcomes:***
* Point cloud compression algorithm developed for OpenCV 3d module (need to follow OpenCV coding style)
* Unit tests and at least 1 example in C++ or Python that demonstrates the functionality
* Documentation/tutorial on how to use this functionality.
* (optional, but desired) integration of OpenCV with [Google's Draco](https://github.com/google/draco) library that provides rich functionality for 3D data compression.
* ***Resources:***
* [Point cloud compression tutorial from Pointclouds library](https://pointclouds.org/documentation/tutorials/compression.html)
* [Introduction to point cloud compression by ZTE](https://res-www.zte.com.cn/mediares/magazine/publication/com_en/article/201803/XUYiling.pdf)
* ***Skills Required:*** Mastery experience coding in C/C++, good understanding of clustering algorithms and, more generally, 3D data processing algorithms.
* ***Possible Mentors:*** Rostislav Vasilikhin
* ***Difficulty:*** Medium to Hard
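
The "throw away some points while preserving the overall geometry" part of the description can be illustrated with voxel-grid downsampling. This is only a minimal sketch of one possible lossy baseline, not the method the project prescribes and not an existing OpenCV API: `Point3`, `voxelDownsample` and `voxelSize` are hypothetical names chosen for illustration, and the code uses only the C++ standard library.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical point type for this sketch (not cv::Point3f).
struct Point3 { float x, y, z; };

// Replaces all points that fall into the same cubic voxel of edge length
// voxelSize with their centroid. Returns the input unchanged if voxelSize <= 0.
std::vector<Point3> voxelDownsample(const std::vector<Point3>& cloud, float voxelSize)
{
    if (voxelSize <= 0.f)
        return cloud;

    struct Accumulator { double sx = 0, sy = 0, sz = 0; std::size_t count = 0; };
    using VoxelKey = std::tuple<std::int64_t, std::int64_t, std::int64_t>;

    // Ordered map: output order is deterministic (sorted by voxel index).
    std::map<VoxelKey, Accumulator> voxels;
    for (const Point3& p : cloud)
    {
        VoxelKey key(static_cast<std::int64_t>(std::floor(p.x / voxelSize)),
                     static_cast<std::int64_t>(std::floor(p.y / voxelSize)),
                     static_cast<std::int64_t>(std::floor(p.z / voxelSize)));
        Accumulator& acc = voxels[key];
        acc.sx += p.x;
        acc.sy += p.y;
        acc.sz += p.z;
        ++acc.count;
    }

    std::vector<Point3> result;
    result.reserve(voxels.size());
    for (const auto& entry : voxels)
    {
        const Accumulator& acc = entry.second;
        result.push_back({static_cast<float>(acc.sx / acc.count),
                          static_cast<float>(acc.sy / acc.count),
                          static_cast<float>(acc.sz / acc.count)});
    }
    return result;
}
```

A larger `voxelSize` discards more points at the cost of geometric detail. A full compression scheme would normally go further, for example by quantizing coordinates and entropy-coding them, which this sketch does not attempt.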
1. #### _IDEA:_ Simple triangle rendering
* ***Description:*** Some 3D algorithms require mesh rendering as their inner part, for example as a feedback for 3d reconstruction. Depending on algorithm type, a depth or a color rendering is required. This function can also be used as a simple debugging tool. Since light, shadows and correct texture rendering are separate huge problems, they are out of scope for this task.
* ***Expected Outcomes:***
@ -156,6 +180,22 @@ All work is in C++ unless otherwise noted.
* ***Possible Mentors:*** Rostislav Vasilikhin
* ***Difficulty:*** Easy to Medium, depending on chosen task scope
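
As a rough illustration of the depth rendering mentioned in the description, the sketch below rasterizes a single triangle into a depth buffer using barycentric coordinates and a z-test. It is an assumption about one simple approach, not the interface the project is expected to deliver: `Vertex`, `edgeFunction` and `renderDepth` are hypothetical names, the vertices are assumed to be already projected to pixel coordinates, and depth is interpolated linearly in screen space (exact for an orthographic projection only).

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical vertex type: x, y are pixel coordinates, z is depth.
struct Vertex { float x, y, z; };

// Twice the signed area of triangle (a, b, c); the sign encodes the winding.
static float edgeFunction(float ax, float ay, float bx, float by, float cx, float cy)
{
    return (bx - ax) * (cy - ay) - (by - ay) * (cx - ax);
}

// Writes the nearest depth of triangle (a, b, c) into a row-major
// width x height buffer that the caller has filled with +infinity.
void renderDepth(const Vertex& a, const Vertex& b, const Vertex& c,
                 int width, int height, std::vector<float>& depth)
{
    const float area = edgeFunction(a.x, a.y, b.x, b.y, c.x, c.y);
    if (area == 0.f)
        return; // degenerate triangle, nothing to draw

    // Clip the triangle's bounding box to the image.
    const int xmin = std::max(0, static_cast<int>(std::floor(std::min({a.x, b.x, c.x}))));
    const int xmax = std::min(width - 1, static_cast<int>(std::ceil(std::max({a.x, b.x, c.x}))));
    const int ymin = std::max(0, static_cast<int>(std::floor(std::min({a.y, b.y, c.y}))));
    const int ymax = std::min(height - 1, static_cast<int>(std::ceil(std::max({a.y, b.y, c.y}))));

    for (int y = ymin; y <= ymax; ++y)
    {
        for (int x = xmin; x <= xmax; ++x)
        {
            const float px = x + 0.5f, py = y + 0.5f; // sample at the pixel center
            // Barycentric weights; dividing by area makes them winding-independent.
            const float w0 = edgeFunction(b.x, b.y, c.x, c.y, px, py) / area;
            const float w1 = edgeFunction(c.x, c.y, a.x, a.y, px, py) / area;
            const float w2 = edgeFunction(a.x, a.y, b.x, b.y, px, py) / area;
            if (w0 < 0.f || w1 < 0.f || w2 < 0.f)
                continue; // pixel center lies outside the triangle

            const float z = w0 * a.z + w1 * b.z + w2 * c.z;
            float& stored = depth[static_cast<std::size_t>(y) * width + x];
            if (z < stored)
                stored = z; // keep the nearest surface
        }
    }
}
```

Calling `renderDepth` once per triangle of a mesh on the same buffer yields a depth image of the whole mesh; color rendering would interpolate per-vertex colors with the same weights.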
1. #### _IDEA:_ Demo for Android
* ***Description:*** Computer vision on mobile phones has also been a hot area. Many years ago we already did similar projects for Android and iOS. Now it's time to do this project again: modern Android + OpenCV + its deep learning module + camera (maybe even with an extra depth sensor on phones that have one). It can be a really cool project that will show your skills and will be extremely helpful for many OpenCV users.
* ***Expected Outcomes:***
* Demo app for Android with all the source code available, no binary blobs or proprietary components.
* The app should preferably be native (in C++).
* The app can use the camera via the native Android API or via the OpenCV API. The latter is preferable, but not necessary.
* The app must use the OpenCV DNN module. It should run some deep net using OpenCV and visualize the results (e.g. draw rectangles around found objects; the fancier the visualization, the better).
* The app should work in realtime on a reasonably fast phone. If it's slow, it's better to choose some lighter net or run it at a lower resolution.
* Preferably the app should not be tightly bound to a particular topology. For example, if it's an object detection demo, it should be possible to change the topology to another one, not via the UI, but by replacing some file and changing its name in a config file or the source code.
* There should be a short tutorial (a markdown file with the key code fragments and some screenshots) on how to build and run it.
* There should be some effort applied (see https://github.com/opencv/opencv/wiki/Compact-build-advice) to make the application compact. That was a separate request from various OpenCV users, and a compact mobile app would be the best example of how to do that.
* ***Skills Required:*** Mastery experience coding in C/C++ or Java, practical experience with developing apps for Android. At least basic understanding of computer vision and deep learning technologies, enough to run a deep net and make it run at realtime speed.
* ***Possible Mentors:*** Vadim Pisarevsky
* ***Difficulty:*** Medium to Hard
----
### _Idea Template:_
```