Image Pyramids {#tutorial_pyramids}
==============
@tableofcontents
@prev_tutorial{tutorial_morph_lines_detection}
@next_tutorial{tutorial_threshold}

| | |
| -: | :- |
| Original author | Ana Huamán |
| Compatibility | OpenCV >= 3.0 |

Goal
----
In this tutorial you will learn how to:

- Use the OpenCV functions **pyrUp()** and **pyrDown()** to upsample or downsample a given
  image.

Theory
------
@note The explanation below is taken from the book **Learning OpenCV** by Bradski and Kaehler.

- Usually we need to convert an image to a size different from its original. For this, there are
  two possible options:
    -# *Upsize* the image (zoom in) or
    -# *Downsize* it (zoom out).
- Although there is a *geometric transformation* function in OpenCV that literally resizes an
  image (**resize**, which we will show in a future tutorial), in this section we first analyze
  the use of **Image Pyramids**, which are widely applied in a huge range of vision
  applications.

### Image Pyramid
- An image pyramid is a collection of images - all arising from a single original image - that are
successively downsampled until some desired stopping point is reached.
- There are two common kinds of image pyramids:
- **Gaussian pyramid:** Used to downsample images
- **Laplacian pyramid:** Used to reconstruct an upsampled image from an image lower in the
pyramid (with less resolution)
- In this tutorial we'll use the *Gaussian pyramid*.
### Gaussian Pyramid
- Imagine the pyramid as a set of layers in which the higher the layer, the smaller the size.

    ![](images/Pyramids_Tutorial_Pyramid_Theory.png)

- Every layer is numbered from bottom to top, so layer \f$(i+1)\f$ (denoted as \f$G_{i+1}\f$) is smaller
  than layer \f$i\f$ (\f$G_{i}\f$).
- To produce layer \f$(i+1)\f$ in the Gaussian pyramid, we do the following:
    - Convolve \f$G_{i}\f$ with a Gaussian kernel:

        \f[\frac{1}{256} \begin{bmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6 & 24 & 36 & 24 & 6 \\ 4 & 16 & 24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix}\f]

    - Remove every even-numbered row and column.
- You can easily notice that the resulting image will be exactly one-quarter the area of its
  predecessor. Iterating this process on the input image \f$G_{0}\f$ (original image) produces the
  entire pyramid.
- The procedure above was useful to downsample an image. What if we want to make it bigger?
    - First, upsize the image to twice the original in each dimension, with the new even rows and
      columns filled with zeros (\f$0\f$).
    - Perform a convolution with the same kernel shown above (multiplied by 4) to approximate the
      values of the "missing pixels".
- These two procedures (downsampling and upsampling as explained above) are implemented by the
  OpenCV functions **pyrDown()** and **pyrUp()**, as we will see in an example with the
  code below:

@note When we reduce the size of an image, we are actually *losing* information of the image.
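As a rough illustration of the two procedures described above, the following pure-Python sketch applies one downsampling step and one upsampling step to an image stored as a list of rows. The helper names (`reflect_101`, `convolve_at`, `reduce_level`, `expand_level`) are hypothetical, and the mirrored border handling is an assumption of this sketch. It is not the code behind **pyrDown()** and **pyrUp()** and is not meant to reproduce their output bit for bit.

```python
# Illustrative sketch only; helper names and border handling are assumptions.

WEIGHTS = [1, 4, 6, 4, 1]
# The 5x5 kernel from the text is the outer product of WEIGHTS with itself (sum = 256).
KERNEL = [[a * b for b in WEIGHTS] for a in WEIGHTS]


def reflect_101(index, size):
    """Mirror an out-of-range index back into [0, size) without repeating the edge."""
    if size == 1:
        return 0
    period = 2 * (size - 1)
    index %= period
    return index if index < size else period - index


def convolve_at(image, row, col, scale):
    """Weighted 5x5 sum centred on (row, col), multiplied by scale / 256."""
    rows, cols = len(image), len(image[0])
    acc = 0
    for kr in range(5):
        for kc in range(5):
            rr = reflect_101(row + kr - 2, rows)
            cc = reflect_101(col + kc - 2, cols)
            acc += KERNEL[kr][kc] * image[rr][cc]
    return acc * scale / 256


def reduce_level(image):
    """Blur with the kernel, then keep only rows and columns 0, 2, 4, ..."""
    rows, cols = len(image), len(image[0])
    return [[convolve_at(image, r, c, 1) for c in range(0, cols, 2)]
            for r in range(0, rows, 2)]


def expand_level(image):
    """Double each dimension with zero filling, then blur with 4 x kernel."""
    rows, cols = len(image), len(image[0])
    upsized = [[0] * (2 * cols) for _ in range(2 * rows)]
    for r in range(rows):
        for c in range(cols):
            upsized[2 * r][2 * c] = image[r][c]
    return [[convolve_at(upsized, r, c, 4) for c in range(2 * cols)]
            for r in range(2 * rows)]


if __name__ == "__main__":
    flat = [[10] * 8 for _ in range(8)]    # hypothetical 8x8 test image
    smaller = reduce_level(flat)
    larger = expand_level(smaller)
    print(len(smaller), len(smaller[0]))   # size after one downsampling step
    print(len(larger), len(larger[0]))     # size after one upsampling step
```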
Code
----
This tutorial's code is shown below.
@add_toggle_cpp
You can also download it from
[here](https://raw.githubusercontent.com/opencv/opencv/4.x/samples/cpp/tutorial_code/ImgProc/Pyramids/Pyramids.cpp)

@include samples/cpp/tutorial_code/ImgProc/Pyramids/Pyramids.cpp
@end_toggle

@add_toggle_java
You can also download it from
[here](https://raw.githubusercontent.com/opencv/opencv/4.x/samples/java/tutorial_code/ImgProc/Pyramids/Pyramids.java)

@include samples/java/tutorial_code/ImgProc/Pyramids/Pyramids.java
@end_toggle

@add_toggle_python
You can also download it from
[here](https://raw.githubusercontent.com/opencv/opencv/4.x/samples/python/tutorial_code/imgProc/Pyramids/pyramids.py)

@include samples/python/tutorial_code/imgProc/Pyramids/pyramids.py
@end_toggle

Explanation
-----------

Let's check the general structure of the program:

### Load an image
@add_toggle_cpp
@snippet cpp/tutorial_code/ImgProc/Pyramids/Pyramids.cpp load
@end_toggle
@add_toggle_java
@snippet java/tutorial_code/ImgProc/Pyramids/Pyramids.java load
@end_toggle
@add_toggle_python
@snippet python/tutorial_code/imgProc/Pyramids/pyramids.py load
@end_toggle
### Create window
@add_toggle_cpp
@snippet cpp/tutorial_code/ImgProc/Pyramids/Pyramids.cpp show_image
@end_toggle
@add_toggle_java
@snippet java/tutorial_code/ImgProc/Pyramids/Pyramids.java show_image
@end_toggle
@add_toggle_python
@snippet python/tutorial_code/imgProc/Pyramids/pyramids.py show_image
@end_toggle
### Loop
@add_toggle_cpp
@snippet cpp/tutorial_code/ImgProc/Pyramids/Pyramids.cpp loop
@end_toggle
@add_toggle_java
@snippet java/tutorial_code/ImgProc/Pyramids/Pyramids.java loop
@end_toggle
@add_toggle_python
@snippet python/tutorial_code/imgProc/Pyramids/pyramids.py loop
@end_toggle
Perform an infinite loop waiting for user input.
Our program exits if the user presses **ESC**. Besides, it has two options:

- **Perform upsampling - Zoom 'i'n (after pressing 'i')**

    We use the function **pyrUp()** with three arguments:
    - *src*: Used as both the current (input) image and the destination image (to be shown on
      screen, supposedly double the size of the input image)
    - *Size( src.cols\*2, src.rows\*2 )*: The destination size. Since we are upsampling,
      **pyrUp()** expects a size double that of the input image (in this case *src*).
@add_toggle_cpp
@snippet cpp/tutorial_code/ImgProc/Pyramids/Pyramids.cpp pyrup
@end_toggle
@add_toggle_java
@snippet java/tutorial_code/ImgProc/Pyramids/Pyramids.java pyrup
@end_toggle
@add_toggle_python
@snippet python/tutorial_code/imgProc/Pyramids/pyramids.py pyrup
@end_toggle
- **Perform downsampling - Zoom 'o'ut (after pressing 'o')**

    We use the function **pyrDown()** with three arguments (similarly to **pyrUp()**):
    - *src*: Used as both the current (input) image and the destination image (to be shown on
      screen, supposedly half the size of the input image)
    - *Size( src.cols/2, src.rows/2 )*: The destination size. Since we are downsampling,
      **pyrDown()** expects half the size of the input image (in this case *src*).
@add_toggle_cpp
@snippet cpp/tutorial_code/ImgProc/Pyramids/Pyramids.cpp pyrdown
@end_toggle
@add_toggle_java
@snippet java/tutorial_code/ImgProc/Pyramids/Pyramids.java pyrdown
@end_toggle
@add_toggle_python
@snippet python/tutorial_code/imgProc/Pyramids/pyramids.py pyrdown
@end_toggle
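The effect of the two keys on the image size can be traced without OpenCV. The sketch below assumes the destination sizes listed above, with integer division as in the C++ expression `Size( src.cols/2, src.rows/2 )`; the function name `trace_sizes` is hypothetical. It only follows the size arithmetic and does not call **pyrUp()** or **pyrDown()**.

```python
def trace_sizes(cols, rows, keys):
    """Return the (cols, rows) size after each 'i' (zoom in) or 'o' (zoom out) key."""
    sizes = [(cols, rows)]
    for key in keys:
        if key == "i":
            cols, rows = cols * 2, rows * 2      # size requested from pyrUp()
        elif key == "o":
            cols, rows = cols // 2, rows // 2    # size requested from pyrDown()
        else:
            continue                             # other keys leave the size untouched
        sizes.append((cols, rows))
    return sizes


if __name__ == "__main__":
    # A 512 x 512 image halves and doubles back to its starting size.
    print(trace_sizes(512, 512, "ooii"))
    # An odd dimension loses a pixel on the way down and does not get it back.
    print(trace_sizes(15, 15, "oi"))
```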
Notice that it is important that both dimensions of the input image are divisible by two.
Otherwise, an error will be shown.

Results
-------
- By default, the program loads the image [chicky_512.png](https://raw.githubusercontent.com/opencv/opencv/4.x/samples/data/chicky_512.png)
  that comes in the `samples/data` folder. Notice that this image is \f$512 \times 512\f$,
  hence a downsample won't generate any error (\f$512 = 2^{9}\f$). The original image is shown below:

    ![](images/Pyramids_Tutorial_Original_Image.jpg)

- First we apply two successive **pyrDown()** operations by pressing 'o'. Our output is:

    ![](images/Pyramids_Tutorial_PyrDown_Result.jpg)

- Note that we have lost some resolution because we reduced the size of the image. This is
  evident after we apply **pyrUp()** twice (by pressing 'i'). Our output is now:

    ![](images/Pyramids_Tutorial_PyrUp_Result.jpg)
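The loss of detail can also be reproduced on a one-dimensional signal. The following sketch uses the 1D weights `[1, 4, 6, 4, 1] / 16`, whose outer product gives the 5x5 kernel from the theory section, together with mirrored borders. The helper names (`down`, `up`, `blur_at`, `reflect_101`) are hypothetical, and the code is an illustration under these assumptions rather than OpenCV's implementation. Two downsampling steps followed by two upsampling steps restore the length of a striped signal but not its stripes.

```python
WEIGHTS = [1, 4, 6, 4, 1]  # 1D counterpart of the 5x5 kernel; the weights sum to 16


def reflect_101(index, size):
    """Mirror an out-of-range index back into [0, size) without repeating the edge."""
    if size == 1:
        return 0
    period = 2 * (size - 1)
    index %= period
    return index if index < size else period - index


def blur_at(signal, pos, scale):
    """Weighted sum of the five samples around pos, multiplied by scale / 16."""
    acc = sum(w * signal[reflect_101(pos + k - 2, len(signal))]
              for k, w in enumerate(WEIGHTS))
    return acc * scale / 16


def down(signal):
    """Blur, then keep every other sample (half the length)."""
    return [blur_at(signal, p, 1) for p in range(0, len(signal), 2)]


def up(signal):
    """Insert zeros between samples, then blur with doubled weights (twice the length)."""
    upsized = [0] * (2 * len(signal))
    upsized[::2] = signal
    return [blur_at(upsized, p, 2) for p in range(len(upsized))]


if __name__ == "__main__":
    stripes = [0, 255] * 8                      # alternating dark and bright samples
    restored = up(up(down(down(stripes))))
    print(len(restored) == len(stripes))        # the length is recovered
    print(restored == stripes)                  # the fine detail is not
```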