Updated GSoC_2022 (markdown)

Aleksandr Voron 2022-03-10 11:31:02 +03:00
parent 4f5eace751
commit 7fa08aae7d

@@ -111,11 +111,11 @@ All work is in C++ unless otherwise noted.
# Ideas:
1. #### _IDEA:_ Audiovisual speech recognition demo
* ***Description:*** Last summer we brought support for audio I/O and some speech recognition models into OpenCV. This time we want to add more speech recognition examples. One particularly interesting demo would be to recognize speech using joint video+audio input, i.e. use video feed as extra data to improve speech recognition
* ***Description:*** Last summer we brought support for audio I/O and some speech recognition models into OpenCV. This time we want to add more speech recognition examples. One particularly interesting demo would recognize speech from joint video+audio input, i.e. use the video feed as extra data to improve speech recognition. We would like to support [the following model](https://paperswithcode.com/paper/towards-practical-lipreading-with-distilled).
* ***Expected Outcomes:***
* A demo that reads video and audio using the OpenCV API, analyzes the speech, and prints the recognized text in real time in the console or in a separate UI window.
* ***Resources:***
* [Audiovisual Speech Recognition & Lipreading](https://paperswithcode.com/task/lipreading)
* [Audiovisual Speech Recognition & Lipreading Model](https://paperswithcode.com/paper/towards-practical-lipreading-with-distilled)
* ***Skills Required:*** training and using deep learning models; basic to good knowledge of NLP; good C++/Python coding skills.
* ***Possible Mentors:*** Batanina Liubov
* ***Duration:*** 175 hours