Updated 2022 (markdown)

Vadim Pisarevsky 2022-07-27 18:07:55 +03:00
parent 45a9b1bd25
commit 1c9bdd44ad

25
2022.md

@ -31,6 +31,31 @@
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
</pre>
## 2022-07-27
It's GSoC 2022 phase-1 time now!
Here are the intermediate status reports from the respective mentors:
* Yuantao F:
1. Lightweight object detection model for the OpenCV model zoo. The student has provided sample code and a quantized model (a variation of NanoDet): https://github.com/opencv/opencv_zoo/pull/75. Performance is mixed, but overall it's a pass. For phase 2, another 1-2 models should be added, including variations of YOLO and another NanoDet.
2. Multi-task CV model: a pull request with code and a model in PyTorch format has also been submitted: https://github.com/opencv/opencv_zoo/pull/74. An ONNX model is not provided yet, and the model is not quantized yet. If INT8 does not work properly, it will at least be converted to FP16.
* Zihao M:
1. Text detection project. A PR with the DB model has been submitted: https://github.com/opencv/opencv_zoo/pull/73. The model works very well. It is 2 MB (English version) or <10 MB (English+Chinese), and it's 2-3x faster than the previous model that was submitted to OpenCV at GSoC 2020.
2. Text recognition. A PR has also been submitted: https://github.com/opencv/opencv_zoo/pull/71. The quality of the model is now being evaluated. The model is not quantized yet. Zihao is adding missing operations to OpenCV to support the model.
* Jean-Yves Bouguet:
1. Maksym is working on multi-camera calibration, including pinhole and fisheye cameras. Some Python tools have been implemented to visualize the calibration process. Code is now being written to evaluate how robust the camera configuration is and how stable the calibration process was. In other words, we want to detect corner cases, e.g. when the board views occupy a very small part of the field of view. Also, the code is going to be tested on real data, not just the artificial datasets used so far. Overall, several normal and challenging test cases are going to be added. Quite satisfied with the performance.
* Jia W:
1. Speech recognition model. Two models are being considered. The first model does not work well but is supported by OpenCV. The other model gives better quality but is not supported by OpenCV yet (and is also very slow). Pass, but with reservations.
2. Data augmentation project. The project is going well. It is 10x-50x faster than the TorchVision data augmentation framework. For now, only traditional vision transformations are supported. It's a definite pass.
* Prof. Xing:
1. RISC-V acceleration. The student, Liutong, is working very well and has submitted the first PR (which was merged). It's a pass, of course.
* Suleyman T:
1. Image codecs project improvements. The student has made some progress; 2 PRs have been submitted. The progress is not brilliant, but the student should pass to the 2nd stage.
<pre>
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
</pre>
## 2022-07-13
* Vadim: