GSoC_2025
Gary Bradski edited this page 2025-05-16 09:05:20 -07:00

OpenCV Google Summer of Code 2025

Jump to Ideas List

Example use of computer vision: Vision Transformer

      Parent of this page   Last year's idea page   IDEAS LIST below


OpenCV Accepted Projects:

Mentor only list

| Contributor | Title | Mentors | Notes |
|---|---|---|---|
| Liigion | Integrate Fractal ArUco | rmsalinas | 🏃 |
| Jinyue Chen | QR/Barcode/ArUco Detector | Shiqi Yu | 🏃 |
| Jorge Belez | Tokenizer Integration in DNN | FYtao | 🏃 |
| Mykhailo Trushch | Computational Photography | Gursimar Singh | 🏃 |
| Oleksandr Nikolskyy | LLM Runtime Support | Vadim, Alexander, Gary | 🏃 |
| ShauryaKpanwar | SLAM | Gary, Reza | 🏃 |
| shyma7004 | Multi-camera Calibration | FYtao | 🏃 |
| Suleyman Turkmen | Enhancing imgcodecs Modules | Vincent | 🏃 |
| Xuanrui Zhu | libcamera Support in VideoCapture | Jia | 🏃 |

Important dates:

| Date (2025) | Description | Comment |
|---|---|---|
| Jan 27 | Organization Applications Open | 👍 |
| Feb 11 | Org Application Deadline | 👍 |
| Feb 27 | Accepted Orgs Announced | 💯 |
| Mar 24 | Contributor Proposals Open | 👍 |
| Apr 08 | Contributor Proposal Deadline | 👍 |
| Apr 29 | Contributor Ranking Deadline | 👍 |
| May 06 | Slot Allocation Deadline | 👍 |
| May 08 | Accepted Projects Announced | 👍 |
| May 8-Jun 1 | Bonding | 🏃 |
| Jun 2 | Coding Starts | |
| Jul 14-18 | Midterm Evals | |
| Aug 25-Sep 1 | Code in, Contrib Evals | |
| Sep 1-8 | Mentor Evals | |
| Nov 10 | Extended Contrib Evals | |
| Nov 17 | Extended Mentor Evals | |

      GSoC Timeline   UTC time   UTC time converter

Resources:

OpenCV Project Ideas List:

Project Discussion list

Index to Ideas Below

  1. Multi-camera calibration part 3
  2. Multi-camera calibration test
  3. Multi-camera calibration toolbox
  4. Quantized models for OpenCV Model Zoo
  5. RISC-V Optimizations
  6. Dynamic CUDA support in DNN
  7. Synchronized multi-camera video recorder
  8. libcamera back-end for VideoCapture
  9. Better LLMs support in OpenCV (1)
  10. Better LLMs support in OpenCV (2)
  11. Computational photography algorithms for better image quality
  12. Improve OpenCV security
  13. Integrate Fractal ArUco into OpenCV
  14. Integrate JuMarker ArUco into OpenCV
  15. LightGlue Matcher with Aliked Feature
  16. Basic SLAM
  17. QR/Barcode/ArUco detector

Idea Template

All work is in C++ unless otherwise noted.


Ideas:

  1. IDEA: Multi-camera calibration part 3

    • Description: During GSoC 2023, a cool new multi-camera calibration algorithm was improved: https://github.com/opencv/opencv/pull/24052. This year we would like to finish this work by adding more test cases, tuning the accuracy, and building a higher-level, user-friendly tool (based on the script from the tutorial) to perform multi-camera calibration. If this is completed before the internship ends, we will move on to leveraging the IMU or marker-free calibration.
    • Expected Outcomes:
      • A series of patches with more unit tests and bug fixes for the multi-camera calibration algorithm
      • New/improved documentation on how to calibrate cameras
      • A short YouTube video showing off how to use the calibration routines
    • Skills Required: Mastery of C++ and Python, mathematical knowledge of camera calibration, ability to code up mathematical models
    • Difficulty: Medium-Difficult
    • Possible Mentors: Maksym Ivashechkin, Alexander Smorkalov
    • Duration: 175 hours
  2. IDEA: Multi-camera calibration test

    • Description: We are looking for a student to curate best-of-class calibration data, collect calibration data with various OpenCV fiducials, and write a script that graphically produces calibration board and camera model data. At the same time, begin writing comprehensive test scripts for all the existing calibration functions and, where necessary, improve the calibration documentation. From this work, derive the expected accuracy of each fiducial type for various camera types.
    • Expected Outcomes:
      • Curate camera calibration data from public datasets.
      • Collect calibration data for various fiducials and camera types.
      • Graphically create camera calibration data with ready-to-go scripts
      • Write test functions for the OpenCV Calibration pipeline
      • New/improved documentation on how to calibrate cameras as needed.
      • Statistical analysis of the performance (accuracy and variance) of OpenCV fiducials, algorithms and camera types.
      • A YouTube video describing and demonstrating the OpenCV Calibration tests.
    • Resources: OpenCV Fiducial Markers, OpenCV Calibration Functions, OpenCV Camera Calibration Tutorial 1, OpenCV Camera Calibration Tutorial 2
    • Skills Required: Mastery of C++ and Python, mathematical knowledge of camera calibration, ability to code up mathematical models
    • Difficulty: Medium
    • Possible Mentors: Jean-Yves Bouguet, Alexander Smorkalov
    • Duration: 175 hours
  3. IDEA: Multi-camera calibration toolbox

    • Description: Build a higher-level, user-friendly tool (based on the script from the calibration tutorial) to perform multi-camera calibration. It should allow easy multi-camera calibration with multiple ChArUco patterns and possibly other calibration fiducial patterns. The tool will use Monte-Carlo sampling to determine parameter stability, allow easy switching of camera models, and output the camera calibration parameters, the pose of each fiducial pattern in space, and the extrinsic location of each camera relative to the others.
    • Expected Outcomes:
      • Tool with convenient API that will be more or less comparable and compatible with Kalibr tool (https://github.com/ethz-asl/kalibr)
      • New/improved documentation on how to calibrate cameras
      • A YouTube video demonstrating how to use the toolbox
    • Skills Required: Python, mathematical knowledge of camera calibration, ability to code up mathematical models
    • Difficulty: Medium-Difficult
    • Possible Mentors: Jean-Yves Bouguet, Gary Bradski
    • Duration: 175 hours
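
The Monte-Carlo stability check mentioned in the description of idea 3 can be sketched roughly as follows. This is a minimal, hypothetical illustration and not the toolbox's design: it resamples a set of per-view focal-length estimates with replacement and reports the spread of the resampled means. The function name `monte_carlo_stability` and the sample values are assumptions; a real tool would presumably rerun the full calibration on each resampled set of views.

```python
import random
import statistics


def monte_carlo_stability(estimates, trials=1000, seed=0):
    """Bootstrap the mean of per-view parameter estimates.

    Returns (mean, std) of the resampled means; a small std suggests
    the parameter is stable with respect to which views are used.
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    n = len(estimates)
    resampled_means = []
    for _ in range(trials):
        # Draw n views with replacement and re-estimate the parameter.
        sample = [estimates[rng.randrange(n)] for _ in range(n)]
        resampled_means.append(statistics.fmean(sample))
    return statistics.fmean(resampled_means), statistics.stdev(resampled_means)


# Hypothetical focal-length estimates (pixels) from individual views.
focal_px = [801.2, 799.5, 800.9, 798.7, 802.1, 800.3, 799.8, 801.0]
mean_f, std_f = monte_carlo_stability(focal_px)
print(f"focal length: {mean_f:.1f} px, spread: {std_f:.2f} px")
```

A large spread relative to the parameter value would hint that more views, or better-distributed views, are needed.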
  4. IDEA: Quantized models for OpenCV Model Zoo

    • Description: Many modern CPUs, GPUs and specialized NPUs include special instructions and hardware blocks for accelerated inference, especially INT8 inference. Quantized models are not only ~4x smaller than the original FP32 models; inference speed also increases significantly (by 2x-4x or more). The number of quantized models is steadily increasing; however, beyond image classification there are not many 8-bit computer vision models with proven high-quality results. We are interested in adding 8-bit models for object detection, optical flow, pose estimation, text detection and recognition, etc. to our model zoo (https://github.com/opencv/opencv_zoo).
    • Expected Outcomes:
      • Series of patches to OpenCV Zoo and maybe to OpenCV DNN (when OpenCV DNN misses 8-bit flavors of certain operations) to add the corresponding models.
      • If quantization is performed by the student during the project, we will request the corresponding scripts used to perform the quantization
      • Benchmark results to prove the quality of the quantized models along with the corresponding scripts so that we can reproduce it.
    • Skills Required: very good ML engineering skills, good Python programming skills, familiarity with model quantization algorithms and model quality assessment approaches
    • Possible Mentors: Feng Yuantao, Zhong Wanli, Vadim Pisarevsky
    • Difficulty: Medium
    • Duration: 90 to 175 hours, depending on the particular model.
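
To make the INT8 idea in the description of idea 4 concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is an illustration under stated assumptions, not the scheme used by OpenCV DNN or the model zoo: the helper names `quantize_int8` and `dequantize` and the weight values are hypothetical, and real toolchains typically add zero points, per-channel scales and calibration data.

```python
import struct


def quantize_int8(values):
    """Symmetric per-tensor quantization: x is approximated by scale * q."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to [-127, 127].
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map INT8 codes back to approximate float values."""
    return [scale * x for x in q]


weights = [0.82, -1.27, 0.05, 0.0, 0.64, -0.33]  # hypothetical FP32 weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error per element is bounded by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))

# Storage: 4 bytes per FP32 value versus 1 byte per INT8 value.
fp32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
int8_bytes = len(struct.pack(f"{len(q)}b", *q))
print(q, scale, max_err, fp32_bytes // int8_bytes)
```

The byte counts show where the roughly 4x size reduction comes from; the accuracy cost has to be measured per model, which is why the project asks for benchmark results.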
  5. IDEA: RISC-V Optimizations

    • Description: RISC-V is one of the main target platforms for OpenCV. Over the past several years we have brought in some RISC-V optimizations based on the RISC-V Vector extension (RVV) by adding another backend to the OpenCV scalable universal intrinsics. We refactored a lot of code in OpenCV to make the vectorized loops compatible with the RISC-V backend and more or less efficient. Still, we see a lot of gaps, and the performance of certain functions can be further improved. For some critical functions, like convolution in deep learning, it perhaps makes sense to implement custom loops using native RVV intrinsics instead of the OpenCV scalable universal intrinsics. This is what we invite you to do.
    • Expected Outcomes:
      • A series of patches for core, imgproc, video and dnn modules to bring improved loops that use OpenCV scalable universal intrinsics or native RVV intrinsics to improve the performance. In the first case the optimizations should not degrade performance on other major platforms like x86-64 or ARMv8 with NEON.
    • Resources:
    • Skills Required: mastery plus experience coding in C++; good skills in optimizing code using SIMD.
    • Possible Mentors: Mingjie Xing, Maxim Shabunin
    • Difficulty: Hard
    • Duration: 350 hours
  6. IDEA: Dynamic CUDA support in DNN

    • Description: The OpenCV DNN module includes several backends for efficient inference on various platforms. Some of the backends are heavy and bring in a lot of dependencies, so it makes sense to make the backends dynamic. Recently, we did this with the OpenVINO backend: https://github.com/opencv/opencv/pull/21745. The goal of this project is to make the CUDA backend of OpenCV DNN dynamic as well. Once it is implemented, we can have a single set of OpenCV binaries and then add the necessary plugin (also in binary form) to accelerate inference on NVIDIA GPUs without recompiling OpenCV.
    • Expected Outcomes:
      • A series of patches for the dnn and maybe core modules to build the OpenCV DNN CUDA plugin as a separate binary that can be used by OpenCV DNN. In this case OpenCV itself should not have any dependency on the CUDA SDK or runtime; the plugin should encapsulate it. It is fine if the user-supplied tensors (cv::Mat) are automatically uploaded to GPU memory by the engine (cv::dnn::Net) before the inference and the output tensors are downloaded from GPU memory after the inference in such a case.
    • Resources:
    • Skills Required: mastery plus experience coding in C++; good practical experience in CUDA. Acquaintance with deep learning is desirable but not necessary, since the project is mostly about software engineering, not about ML algorithms or their optimization.
    • Possible Mentors: Alexander Smorkalov
    • Difficulty: Hard
    • Duration: 350 hours
  7. IDEA: Synchronized multi-camera video recorder

    • Description: Multi-camera calibration and multi-view scenarios require synchronous recording with multiple cameras. The task is to tune cv::VideoCapture and/or VideoWriter and to implement a sample that records timestamped video from several cameras.
    • Expected Outcomes:
      • Sync video recording sample for several cameras: V4L2, RTSP(?)
    • Resources: Overview
    • Skills Required: C++
    • Possible Mentors: Alexander S.
    • Difficulty: Easy-Medium
    • Duration: 175 hours
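
One way to pair frames from several cameras by timestamp, as the description of idea 7 suggests, is nearest-neighbor matching within a tolerance. The sketch below is a hypothetical illustration in plain Python: `match_frames`, the 5 ms tolerance and the timestamp values are assumptions, and it does not use cv::VideoCapture itself.

```python
from bisect import bisect_left


def match_frames(reference_ts, other_ts, tolerance=0.005):
    """Pair each reference timestamp with the nearest timestamp in other_ts.

    Both lists hold capture times in seconds, sorted ascending.
    Returns a list of (ref_index, other_index) pairs; reference frames
    with no partner within `tolerance` seconds are skipped.
    """
    pairs = []
    for i, t in enumerate(reference_ts):
        pos = bisect_left(other_ts, t)
        # Candidates are the neighbors just before and just after t.
        candidates = [j for j in (pos - 1, pos) if 0 <= j < len(other_ts)]
        if not candidates:
            continue
        best = min(candidates, key=lambda j: abs(other_ts[j] - t))
        if abs(other_ts[best] - t) <= tolerance:
            pairs.append((i, best))
    return pairs


# Hypothetical timestamps from two 30 fps cameras; camera B dropped one frame.
cam_a = [0.000, 0.033, 0.067, 0.100]
cam_b = [0.001, 0.034, 0.101]
print(match_frames(cam_a, cam_b))
```

This sketch does not stop one frame from being paired with two reference frames, and it assumes the cameras share a clock; a real recorder would likely need to handle both.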
  8. IDEA: libcamera back-end for VideoCapture

    • Description: Discussion: #21653
    • Expected Outcomes:
      • MIPI camera support on Raspberry Pi
    • Resources:
    • Skills Required: C++, Linux
    • Possible Mentors: TBD
    • Difficulty: Medium
    • Duration: 175 hours
  9. IDEA: Better LLMs support in OpenCV (1)

  10. IDEA: Better LLMs support in OpenCV (2)

    • Description: The state of LLM support in OpenCV is unclear. A key feature of the current dnn engine is fixed memory allocation after model import, which helps speed up inference for CNNs but becomes a limitation when running LLMs, which have variable-size inputs. This project aims to try as many LLMs as possible with OpenCV, fix the dnn engine to support more LLMs, and write demos that show LLM inference with OpenCV.
    • Expected Outcomes:
      • Several patches that fix the dnn engine to support more LLMs.
      • Several patches that add demos to show LLM inference with OpenCV.
    • Resources:
    • Skills Required: C++, Python, LLMs/Transformers
    • Mentor: Yuantao Feng
    • Difficulty: Hard
    • Duration: 175 hours
  11. IDEA: Computational photography algorithms for better image quality

    • Description: Improving image quality is an important task that is still not covered well in OpenCV. We already have "non-local means" (photo module) and BM3D (opencv_contrib) denoising algorithms, simple white balance algorithms (opencv_contrib), a very simple exposure correction function (equalizeHist in imgproc; grayscale images only) and a function for distortion correction (undistort function in imgproc), and that's it. The following could be useful to have:

      • more efficient/better-quality denoising algorithms
      • vignetting correction
      • chromatic aberration correction
      • smarter white balance algorithms
      • exposure correction for color images
      • multi-frame (image burst) denoising
      • superresolution for still images and video
      • deblurring
      • color enhancement, defogging
      • etc.

      Note that:

      1. this idea is not about any special effects, 'beautification' etc. It's about improving pure technical image quality
      2. the idea is quite big; applicants may, and probably should, propose implementing a subset of the above items. They can also add something on top (as long as note 1 above is taken into account).
    • Expected Outcomes:

      • Several patches to opencv_photo module and/or opencv_contrib repo that add the new functionality, tests, samples etc.
    • Resources:

      • TBD
    • Skills Required: C++, Python

    • Mentor(s): Gursimar Singh, Vadim Pisarevsky as adviser.

    • Difficulty: Hard

    • Duration: 175 hours

  12. IDEA: Improve OpenCV's security

  13. IDEA: Integrate Fractal ArUco into OpenCV

    • Description: A fractal marker is a new kind of marker composed of several square fiducial markers of different sizes nested inside one another. Unlike traditional fiducial markers, a fractal marker can be detected over a wide range of distances, and it also addresses problems of partial or total occlusion of the marker.
    • Expected Outcomes:
      • Integrate Fractal ArUco into OpenCV with a simple API, which should be similar to the ArUco API currently in OpenCV.
      • Detailed documents for Fractal ArUco API in OpenCV
      • A nice demo to show how to use the algorithm.
    • Resources: Fractal ArUco
    • Skills Required: C++, Python.
    • Mentor: Rafael Muñoz Salinas, Shiqi Yu
    • Difficulty: Easy
    • Duration: 175 hours
  14. IDEA: Integrate JuMarker ArUco into OpenCV

    • Description: Fiducial markers such as QR codes, ArUco, and AprilTag have become very popular tools for labeling and camera positioning. They are robust and easy to detect, even on devices with low computing power. However, their industrial appearance deters their use in scenarios where an attractive and visually appealing look is required. In these cases, it would be preferable to use customized markers showing, for instance, a company logo. This work proposes a novel method to design, detect, and track customizable fiducial markers. Our work allows creating marker templates while imposing few restrictions on their design, e.g., a company logo or a picture can be used. The designer must indicate the positions in the template where bits will encode a unique identifier for each marker. Then, our method will automatically create a dictionary of markers, all following the same design, but each with a unique identifier.
    • Expected Outcomes:
      • Integrate JuMarker ArUco into OpenCV with a simple API, which should be similar to the ArUco API currently in OpenCV.
      • Detailed documents for JuMarker ArUco API in OpenCV
      • A nice demo to show how to create a JuMarker and to detect it.
    • Resources: JuMarker ArUco
    • Skills Required: C++, Python.
    • Mentor: Rafael Muñoz Salinas, Shiqi Yu
    • Difficulty: Hard
    • Duration: 350 hours
  15. IDEA: LightGlue Matcher with Aliked Feature

    • Description: Add the LightGlue feature matcher to OpenCV (to join the BFMatcher and the FLANN matcher), then add the ALIKED feature to the feature detector descriptors so that we can use any one of the features (ALIKED, SIFT, SURF, ORB, BRISK ...) to match points between two images. As an extra, add the subpixel-accurate detector keypt2subpx as a post-processor on keypoints, formatting them correctly.
    • Expected Outcomes:
      • Add LightGlue as a new feature matcher
      • Add ALIKED to DNN as a feature detector descriptor, formatting the output data to work with OpenCV's feature matchers
      • Add subpixel-accurate keypoint adjustment (keypt2subpx)
      • Create an example of subpixel accurate feature matching between pairs of images
      • Create test code, documentation and a video of it working
    • Resources:
    • Skills Required: Python, computer vision AI model training, PyTorch
    • Mentor: Gary Bradski, Gursimar Singh
    • Difficulty: Medium
    • Duration: 200 hours
  16. IDEA: Basic SLAM

  17. IDEA: QR/Barcode/ArUco detector

    • Description: QR codes, barcodes and ArUco markers are all popular codes in computer vision applications. OpenCV now supports all of them and can detect and decode them, but better detectors and decoders are still desired. If possible, one efficient deep detector for all kinds of codes would simplify usage. If a single efficient deep detector cannot be achieved, several deep models are also acceptable.
    • Expected Outcomes:
      • Train a deep detector for QR, Barcode and ArUco. Or train three different deep detectors for different codes specifically.
      • The trained model should be easy to implement with the current algorithms in OpenCV on QR/Barcode/ArUco.
      • A nice demo to show how to use the algorithm.
      • Detailed report to demonstrate whether the trained detector(s) are better than the current solution in OpenCV.
    • Resources:
    • Skills Required: C++, Python, and experience with object detection.
    • Mentor: Shiqi Yu
    • Difficulty: Hard
    • Duration: 350 hours

Idea Template:

1. #### _IDEA:_ Your title here
   * ***Description:*** 3-7 sentences describing the task
   * ***Expected Outcomes:***
      * < Short bullet list describing what is to be accomplished >
      * <e.g. create a new module called "bla bla">
      * < Has method to accomplish X >
      * <...>
   * ***Resources:***
      * [For example a paper citation](https://arxiv.org/pdf/1802.08091.pdf)
      * [For example an existing feature request](https://github.com/opencv/opencv/issues/11013)
      * [Possibly an existing related module](https://github.com/opencv/opencv_contrib/tree/master/modules/optflow) that includes some new optical flow algorithms.
   * ***Skills Required:*** < for example mastery plus experience coding in C++, college course work in vision that covers optical flow, python. Best if you have also worked with deep neural networks. >
   * ***Mentor:*** < your name goes here >
   * ***Difficulty:*** <Easy, Medium, Hard>
   * ***Duration:*** <175 <normal> 350 <extended>>


Contributors

How to Apply

The process at Google is described on the GSoC home page

How contributors will be evaluated once working:

  • Contributors will be paid only if:
    • Phase 1:
      • You must generate a pull request
        • That builds
        • Has at least stubbed-out functionality (placeholder functions, such as just displaying an image)
        • With OpenCV appropriate Doxygen documentation (example tutorial)
          • Includes what the function or net is and what it is used for
        • Has at least stubbed out unit test
        • Has a stubbed out example/tutorial of use that builds
    • Phase 2:
      • You must generate a pull request
        • That builds
        • Has all or most of the planned functionality (and is still usable without any missing parts)
        • With OpenCV appropriate Doxygen documentation
          • Includes what the function or net is and what it is used for
        • Has some unit tests
        • Has a tutorial/sample of how to use the function or net and why you'd want to use it.
      • Optional, but highly desirable: create a short (30 sec to 1 min) movie, preferably on YouTube but any movie will do, that demonstrates your project. We will use it to create the final video.
    • Extended period:
      • TBD

Mentors:

  1. Contact us, preferably in February or early March, on the opencv-gsoc googlegroups mailing list above and ask to be a mentor (or we will ask you in some known cases)
  2. If we accept you, we will post a request from the Google Summer of Code OpenCV project site asking you to join.
  3. You must accept the request, and then you are a mentor!
  4. You then:
    • Look through the ideas above, choose one you'd like to mentor or create your own and post it for discussion on the mentor list.
    • Go to the opencv-gsoc googlegroups mailing list above and look through the project proposals and discussions. Discuss the ideas you've chosen.
      • Find likely contributors, ask them to apply to your project(s)
    • You will get a list of contributors who have applied to your project. Go through them and select a contributor. Rejecting them all if none suits, and then either joining another project as a co-mentor or sitting out this year, are acceptable outcomes.
  5. Then, when we get a slot allocation from Google, the administrators "spend" the slots in order of priority, influenced by whether there is a capable mentor for each topic.
  6. Contributors must finally actually accept to do that project (some sign up for multiple organizations and then choose)
  7. Get to work!

If you are accepted as a mentor and you find a suitable contributor and we give you a slot and the contributor signs up for it, then you are an actual mentor! Otherwise you are not a mentor and have no other obligations.

  • Thank you for trying.
  • You may contact other mentors and co-mentor a project.

You get paid a modest stipend over the summer to mentor, typically $500 minus an org fee of 10%.

Several mentors donate their salary, earning ever better positions in heaven when that comes.

Potential Mentors List:

Ankit Sachan
Anatoliy Talamanov
Clément Pinard
Davis King
Dmitry Kurtaev
Dmitry Matveev
Edgar Riba
Gholamreza Amayeh
Grace Vesom
Jiri Hörner
João Cartucho
Justin Shenk
Michael Tetelman
Ningxin Hu
Rafael Muñoz Salinas
Rostislav Vasilikhin
Satya Mallick
Stefano Fabri
Steven Puttemans
Sunita Nayak
Vikas Gupta
Vincent Rabaud
Vitaly Tuzov
Vladimir Tyan
Yida Wang
Jia Wu
Yuantao Feng
Zihao Mu

Admins

Gary Bradski
Vadim Pisarevsky
Shiqi Yu

GSoC Org Application Answers

Answers from our OpenCV GSoC application