opencv/doc/tutorials/gpu/gpu-basics-similarity/gpu-basics-similarity.rst

.. _gpuBasicsSimilarity:

Similarity check (PSNR and SSIM) on the GPU
*******************************************

Goal
====

In the :ref:`videoInputPSNRMSSIM` tutorial I presented the PSNR and SSIM methods for checking the
similarity between two images. As you could see there, computing these takes quite some time,
especially in the case of the SSIM. However, if the performance of the OpenCV implementation for
the CPU does not satisfy you and you happen to have an NVIDIA CUDA GPU device in your system, all
is not lost. You may try to port or write your algorithm for the video card.

This tutorial will give you a good grasp of how to approach coding with the GPU module of OpenCV.
As a prerequisite you should already know how to handle the core, highgui and imgproc modules. So,
our goals are:

.. container:: enumeratevisibleitemswithsquare

   + What's different compared to the CPU?
   + Create the GPU code for the PSNR and SSIM
   + Optimize the code for maximal performance

The source code
===============

You may also find the source code and the video files in the
:file:`samples/cpp/tutorial_code/gpu/gpu-basics-similarity/gpu-basics-similarity` folder of the
OpenCV source library or :download:`download it from here
<../../../../samples/cpp/tutorial_code/gpu/gpu-basics-similarity/gpu-basics-similarity.cpp>`. The
full source code is quite long (due to the handling of the command line arguments and the
performance measurement). Therefore, to avoid cluttering up these sections with those parts, you'll
find here only the functions themselves.

The PSNR returns a float number that, if the two inputs are similar, lies between 30 and 50 (higher
is better).

.. literalinclude:: ../../../../samples/cpp/tutorial_code/gpu/gpu-basics-similarity/gpu-basics-similarity.cpp
   :language: cpp
   :linenos:
   :tab-width: 4
   :lines: 165-210, 18-23, 210-235
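
To make the returned number concrete, here is a minimal CPU-only sketch of the PSNR formula,
:math:`10 \log_{10}(255^2 / MSE)`, on plain 8-bit buffers. The helper name ``psnrSketch`` and the
sentinel return values are assumptions made for this illustration; it is not the sample code shown
above and it uses no OpenCV calls.

.. code-block:: cpp

   #include <cmath>
   #include <cstddef>
   #include <cstdint>
   #include <vector>

   // Hypothetical helper (not part of OpenCV): PSNR in decibels between two
   // 8-bit buffers of equal length.
   double psnrSketch(const std::vector<std::uint8_t>& a,
                     const std::vector<std::uint8_t>& b)
   {
       if (a.size() != b.size() || a.empty())
           return -1.0;                    // assumption: -1 marks invalid input

       double sse = 0.0;                   // sum of squared errors
       for (std::size_t i = 0; i < a.size(); ++i)
       {
           const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
           sse += d * d;
       }

       if (sse <= 1e-10)                   // identical inputs: the ratio is undefined
           return 0.0;                     // assumption: 0 is used as a sentinel

       const double mse = sse / static_cast<double>(a.size());
       return 10.0 * std::log10((255.0 * 255.0) / mse);
   }

For two buffers that differ by 10 grey levels in every sample, the mean squared error is 100 and the
function returns roughly 28.13 dB. A difference of a single grey level everywhere gives about
48.13 dB, which falls in the 30 to 50 range mentioned above.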

The SSIM returns the MSSIM of the images. This too is a float number between zero and one (higher is
better); however, we have one for each channel. Therefore, we return a *Scalar* OpenCV data
structure:

.. literalinclude:: ../../../../samples/cpp/tutorial_code/gpu/gpu-basics-similarity/gpu-basics-similarity.cpp
   :language: cpp
   :linenos:
   :tab-width: 4
   :lines: 235-355, 26-42, 357-
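
As a rough illustration of what the index measures, the sketch below evaluates the SSIM formula once
over a whole single-channel buffer. This is a deliberate simplification: the MSSIM described above is
a mean over local windows and is computed per channel, whereas this hypothetical ``globalSsimSketch``
helper uses global means, variances and covariance only. The constants
:math:`C_1 = (0.01 \cdot 255)^2` and :math:`C_2 = (0.03 \cdot 255)^2` are the values commonly used
for 8-bit data and are assumed here.

.. code-block:: cpp

   #include <cstddef>
   #include <cstdint>
   #include <vector>

   // Hypothetical helper (not part of OpenCV): SSIM evaluated once over the
   // whole buffer, with no local windowing.
   double globalSsimSketch(const std::vector<std::uint8_t>& x,
                           const std::vector<std::uint8_t>& y)
   {
       if (x.size() != y.size() || x.empty())
           return -1.0;                    // assumption: -1 marks invalid input

       const double C1 = 6.5025;           // (0.01 * 255)^2
       const double C2 = 58.5225;          // (0.03 * 255)^2
       const double n  = static_cast<double>(x.size());

       // First pass: means of both signals.
       double sumX = 0.0, sumY = 0.0;
       for (std::size_t i = 0; i < x.size(); ++i)
       {
           sumX += x[i];
           sumY += y[i];
       }
       const double muX = sumX / n;
       const double muY = sumY / n;

       // Second pass: variances and covariance around those means.
       double varX = 0.0, varY = 0.0, cov = 0.0;
       for (std::size_t i = 0; i < x.size(); ++i)
       {
           const double dx = x[i] - muX;
           const double dy = y[i] - muY;
           varX += dx * dx;
           varY += dy * dy;
           cov  += dx * dy;
       }
       varX /= n;
       varY /= n;
       cov  /= n;

       return ((2.0 * muX * muY + C1) * (2.0 * cov + C2)) /
              ((muX * muX + muY * muY + C1) * (varX + varY + C2));
   }

Identical buffers give 1. A buffer compared with a uniformly brightened copy gives a value slightly
below 1, and strongly anti-correlated buffers can even produce a negative value in this global form.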

How to do it? - The GPU
=======================

As you can see, we have three functions for each operation: one for the CPU and two for the GPU.
The reason I made two for the GPU is to illustrate that simply porting your CPU code to the GPU will
often make it slower. If you want some performance gain you will need to remember a few rules,
which I'm going to detail later on.

The GPU module was developed to resemble its CPU counterpart as closely as possible, which makes
porting easy. The first thing you need to do before writing any code is to link the GPU module to
your project and include the header file for the module. All the functions and data structures of
the GPU are in a *gpu* sub namespace of the *cv* namespace. You may add this to the default one via
the *using namespace* directive, or mark it everywhere explicitly via the cv:: prefix to avoid
confusion. I'll do the latter.

.. code-block:: cpp

   #include <opencv2/gpu.hpp>        // GPU structures and methods

GPU stands for **g**\ raphics **p**\ rocessing **u**\ nit. It was originally built to render
graphical scenes. These scenes are built from a lot of data; however, the pieces do not all depend
on one another sequentially, so they can be processed in parallel. Because of this a GPU contains
many smaller processing units. These aren't state of the art processors, and in a one-on-one test
against a CPU each of them falls behind. However, the GPU's strength lies in its numbers. In recent
years there has been an increasing trend to harness this massive parallel power of the GPU for tasks
other than graphical scene rendering too. This gave birth to general-purpose computation on graphics
processing units (GPGPU).

The GPU has its own memory. When you read data from th