Merge pull request #15942 from OrestChura:fb_tutorial

G-API: Tutorial: Face beautification algorithm implementation

* Introduce a tutorial on face beautification algorithm

- small typo issue in render_ocv.cpp

* Addressing comments rgarnov smirnov-alexey
This commit is contained in:
Orest Chura 2019-12-17 11:00:49 +03:00 committed by Alexander Alekhin
parent c6c8783c60
commit 287874a444
5 changed files with 824 additions and 251 deletions

View File

@ -0,0 +1,440 @@
# Implementing a face beautification algorithm with G-API {#tutorial_gapi_face_beautification}
[TOC]
# Introduction {#gapi_fb_intro}
In this tutorial you will learn:
* Basics of a sample face beautification algorithm;
* How to infer different networks inside a pipeline with G-API;
* How to run a G-API pipeline on a video stream.
## Prerequisites {#gapi_fb_prerec}
This sample requires:
- PC with GNU/Linux or Microsoft Windows (Apple macOS is supported but
was not tested);
- OpenCV 4.2 or later built with Intel® Distribution of [OpenVINO™
Toolkit](https://docs.openvinotoolkit.org/) (building with [Intel®
TBB](https://www.threadingbuildingblocks.org/intel-tbb-tutorial) is
a plus);
- The following topologies from OpenVINO™ Toolkit [Open Model
Zoo](https://github.com/opencv/open_model_zoo):
- `face-detection-adas-0001`;
- `facial-landmarks-35-adas-0002`.
## Face beautification algorithm {#gapi_fb_algorithm}
We will implement a simple face beautification algorithm using a
combination of modern Deep Learning techniques and traditional
Computer Vision. The general idea behind the algorithm is to make the
face skin smoother while preserving the contrast of face features such
as the eyes and the mouth. The algorithm identifies parts of the face
using DNN inference, applies different filters to the parts found, and
then combines them into the final result using basic image arithmetic:
\dot
strict digraph Pipeline {
node [shape=record fontname=Helvetica fontsize=10 style=filled color="#4c7aa4" fillcolor="#5b9bd5" fontcolor="white"];
edge [color="#62a8e7"];
ordering="out";
splines=ortho;
rankdir=LR;
input [label="Input"];
fd [label="Face\ndetector"];
bgMask [label="Generate\nBG mask"];
unshMask [label="Unsharp\nmask"];
bilFil [label="Bilateral\nfilter"];
shMask [label="Generate\nsharp mask"];
blMask [label="Generate\nblur mask"];
mul_1 [label="*" fontsize=24 shape=circle labelloc=b];
mul_2 [label="*" fontsize=24 shape=circle labelloc=b];
mul_3 [label="*" fontsize=24 shape=circle labelloc=b];
subgraph cluster_0 {
style=dashed
fontsize=10
ld [label="Landmarks\ndetector"];
label="for each face"
}
sum_1 [label="+" fontsize=24 shape=circle];
out [label="Output"];
temp_1 [style=invis shape=point width=0];
temp_2 [style=invis shape=point width=0];
temp_3 [style=invis shape=point width=0];
temp_4 [style=invis shape=point width=0];
temp_5 [style=invis shape=point width=0];
temp_6 [style=invis shape=point width=0];
temp_7 [style=invis shape=point width=0];
temp_8 [style=invis shape=point width=0];
temp_9 [style=invis shape=point width=0];
input -> temp_1 [arrowhead=none]
temp_1 -> fd -> ld
ld -> temp_4 [arrowhead=none]
temp_4 -> bgMask
bgMask -> mul_1 -> sum_1 -> out
temp_4 -> temp_5 -> temp_6 [arrowhead=none constraint=none]
ld -> temp_2 -> temp_3 [style=invis constraint=none]
temp_1 -> {unshMask, bilFil}
fd -> unshMask [style=invis constraint=none]
unshMask -> bilFil [style=invis constraint=none]
bgMask -> shMask [style=invis constraint=none]
shMask -> blMask [style=invis constraint=none]
mul_1 -> mul_2 [style=invis constraint=none]
temp_5 -> shMask -> mul_2
temp_6 -> blMask -> mul_3
unshMask -> temp_2 -> temp_5 [style=invis]
bilFil -> temp_3 -> temp_6 [style=invis]
mul_2 -> temp_7 [arrowhead=none]
mul_3 -> temp_8 [arrowhead=none]
temp_8 -> temp_7 [arrowhead=none constraint=none]
temp_7 -> sum_1 [constraint=none]
unshMask -> mul_2 [constraint=none]
bilFil -> mul_3 [constraint=none]
temp_1 -> mul_1 [constraint=none]
}
\enddot
Briefly, the algorithm is described as follows:
- Input image \f$I\f$ is passed to unsharp mask and bilateral filters
(\f$U\f$ and \f$L\f$ respectively);
- Input image \f$I\f$ is passed to an SSD-based face detector;
- SSD result (a \f$[1 \times 1 \times 200 \times 7]\f$ blob) is parsed
and converted to an array of faces;
- Every face is passed to a landmarks detector;
- Based on landmarks found for every face, three image masks are
generated:
- A background mask \f$b\f$ -- indicating which areas from the
original image to keep as-is;
- A face part mask \f$p\f$ -- identifying regions to preserve
(sharpen);
- A face skin mask \f$s\f$ -- identifying regions to blur;
- The final result \f$O\f$ is a composition of features above
calculated as \f$O = b*I + p*U + s*L\f$.
Generating face element masks based on a limited set of features (just
35 landmarks per face, covering all of its parts) is not trivial and is
described in the sections below.
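To make the composition formula concrete, here is a minimal plain-OpenCV
sketch of the final blending step. It is not part of the sample; it assumes
the three single-channel masks are already computed and do not overlap, and
the helper name `composeResult` is hypothetical:

```cpp
#include <opencv2/core.hpp>

// Plain-OpenCV illustration of O = b*I + p*U + s*L.
// I, U, L are CV_8UC3 images of the same size; b, p, s are CV_8UC1 masks
// (0/255) which do not intersect. The sample itself expresses the same
// composition as a G-API graph.
static cv::Mat composeResult(const cv::Mat &I, const cv::Mat &U, const cv::Mat &L,
                             const cv::Mat &b, const cv::Mat &p, const cv::Mat &s)
{
    cv::Mat O = cv::Mat::zeros(I.size(), I.type());
    I.copyTo(O, b);   // background: keep the original pixels
    U.copyTo(O, p);   // face features: take the sharpened pixels
    L.copyTo(O, s);   // face skin: take the smoothed pixels
    return O;
}
```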
# Constructing a G-API pipeline {#gapi_fb_pipeline}
## Declaring Deep Learning topologies {#gapi_fb_decl_nets}
This sample uses two DNN detectors. Every network takes one input
and produces one output. In G-API, networks are defined with the
G_API_NET() macro:
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp net_decl
To get more information, see
[Declaring Deep Learning topologies](@ref gapi_ifd_declaring_nets)
described in the "Face Analytics pipeline" tutorial.
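For reference, the two declarations from the `net_decl` snippet look like
this; both topologies take a single image and return a single blob, hence the
identical `<cv::GMat(cv::GMat)>` signature, and the string is just a unique tag:

```cpp
#include <opencv2/gapi/infer.hpp>

G_API_NET(FaceDetector,  <cv::GMat(cv::GMat)>, "face_detector");
G_API_NET(LandmDetector, <cv::GMat(cv::GMat)>, "landm_detector");
```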
## Describing the processing graph {#gapi_fb_ppline}
The code below generates a graph for the algorithm above:
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp ppl
The resulting graph is a mixture of G-API's standard operations,
user-defined operations (namespace `custom::`), and DNN inference.
The generic function `cv::gapi::infer<>()` allows triggering inference
within the pipeline; the networks to infer with are specified as template
parameters. The sample code uses two versions of `cv::gapi::infer<>()`:
- A frame-oriented one is used to detect faces on the input frame.
- An ROI-list oriented one is used to run landmarks inference on a
  list of faces -- this version produces an array of landmarks for
  every face.
More on this in "Face Analytics pipeline"
([Building a GComputation](@ref gapi_ifd_gcomputation) section).
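For convenience, the two inference calls as they appear in the graph code
(abridged from the `ppl` snippet) are:

```cpp
// Frame-oriented inference: the face detector runs once on the whole frame.
cv::GMat faceOut = cv::gapi::infer<custom::FaceDetector>(gimgIn);

// ROI-list-oriented inference: the landmarks detector runs on every face
// rectangle in garRects and produces one output blob per face.
cv::GArray<cv::GMat> landmOut = cv::gapi::infer<custom::LandmDetector>(garRects, gimgIn);
```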
## Unsharp mask in G-API {#gapi_fb_unsh}
The unsharp mask \f$U\f$ for image \f$I\f$ is defined as:
\f[U = I - s * L(M(I)),\f]
where \f$M()\f$ is a median filter, \f$L()\f$ is the Laplace operator,
and \f$s\f$ is a strength coefficient. While G-API doesn't provide
this function out-of-the-box, it is expressed naturally with the
existing G-API operations:
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp unsh
Note that the code snippet above is a regular C++ function defined
with G-API types. Users can write functions like this to simplify
graph construction; when called, such a function just puts the relevant
nodes into the pipeline it is used in.
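A sketch of what the `unsh` snippet contains is given below; the
`cv::gapi::medianBlur()` call is assumed here for \f$M()\f$, while
`custom::GLaplacian` is the user-defined Laplacian operation described later:

```cpp
inline cv::GMat custom::unsharpMask(const cv::GMat &src,
                                    const int       sigma,
                                    const float     strength)
{
    cv::GMat blurred   = cv::gapi::medianBlur(src, sigma);          // M(I)
    cv::GMat laplacian = custom::GLaplacian::on(blurred, CV_8U);    // L(M(I))
    return (src - (laplacian * strength));                          // I - s*L(M(I))
}
```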
# Custom operations {#gapi_fb_proc}
The face beautification graph uses custom operations
extensively. This chapter focuses on the most interesting kernels;
refer to [G-API Kernel API](@ref gapi_kernel_api) for general
information on defining operations and implementing kernels in G-API.
## Face detector post-processing {#gapi_fb_face_detect}
A face detector output is converted to an array of faces with the
following kernel:
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp vec_ROI
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp fd_pp
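To illustrate what the kernel does, here is a plain-C++ sketch of the parsing
logic (not the kernel itself): detections with low confidence are skipped,
normalized coordinates are scaled to pixels, and the resulting rectangle is
clipped to the frame. The helper name `parseSSD` is hypothetical.

```cpp
#include <vector>
#include <opencv2/core.hpp>

// Parse an SSD output blob of shape [1, 1, N, 7] into pixel-space face rects.
// Every detection is [image_id, label, conf, x_min, y_min, x_max, y_max],
// all floats, with coordinates normalized to [0, 1].
static std::vector<cv::Rect> parseSSD(const cv::Mat &blob, const cv::Size &frame,
                                      float confThreshold)
{
    std::vector<cv::Rect> faces;
    const float *data = blob.ptr<float>();
    const int numDetections = blob.size[2];            // N
    const cv::Rect borders({0, 0}, frame);             // the whole frame
    for (int i = 0; i < numDetections; i++)
    {
        if (data[i * 7 + 0] < 0.f) break;              // no more detections
        const float conf = data[i * 7 + 2];
        if (conf <= confThreshold) continue;           // drop weak detections
        const cv::Point tl(static_cast<int>(data[i * 7 + 3] * frame.width),
                           static_cast<int>(data[i * 7 + 4] * frame.height));
        const cv::Point br(static_cast<int>(data[i * 7 + 5] * frame.width),
                           static_cast<int>(data[i * 7 + 6] * frame.height));
        // Clip to the frame so the landmarks detector never gets an
        // out-of-image ROI:
        faces.push_back(cv::Rect(tl, br) & borders);
    }
    return faces;
}
```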
## Facial landmarks post-processing {#gapi_fb_landm_detect}
The algorithm infers the locations of face elements (like the eyes, the mouth
and the head contour itself) using a generic facial landmarks detector
(<a href="https://github.com/opencv/open_model_zoo/blob/master/models/intel/facial-landmarks-35-adas-0002/description/facial-landmarks-35-adas-0002.md">details</a>)
from OpenVINO™ Open Model Zoo. However, the detected landmarks as-is are not
enough to generate masks --- this operation requires regions of interest on
the face represented by closed contours, so some interpolation is applied to
obtain them. This landmarks processing and interpolation is performed by the
following kernel:
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp ld_pp_cnts
The kernel takes two arrays of denormalized landmark coordinates and
returns an array of closed contours of face elements and an array of
closed face contours; in other words, the first output is an array of
contours of image areas to be sharpened and the second is an array of
contours of areas to be smoothed.
Here and below, `Contour` is a vector of points.
### Getting an eye contour {#gapi_fb_ld_eye}
Eye contours are estimated with the following function:
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp ld_pp_incl
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp ld_pp_eye
Briefly, this function restores the bottom side of an eye by a
half-ellipse based on two points in the left and right eye
corners. In fact, `cv::ellipse2Poly()` is used to approximate the eye region, and
the function only defines ellipse parameters based on just two points:
- The ellipse center and the \f$X\f$ half-axis calculated from the two eye points;
- The \f$Y\f$ half-axis calculated according to the assumption that an average
eye height is about \f$1/3\f$ of its width;
- The start and the end angles which are 0 and 180 (refer to
`cv::ellipse()` documentation);
- The angle delta: how many points to produce in the contour;
- The inclination angle of the axes.
The use of `atan2()` instead of just `atan()` in the function
`custom::getLineInclinationAngleDegrees()` is essential, as it takes the signs
of `x` and `y` into account and can return a negative angle, so we get the
correct angle even for an upside-down face arrangement (if we put the points
in the right order, of course).
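For convenience, the function in question (as shown in the `ld_pp_incl`
snippet) is:

```cpp
inline int custom::getLineInclinationAngleDegrees(const cv::Point &ptLeft, const cv::Point &ptRight)
{
    const cv::Point residual = ptRight - ptLeft;
    if (residual.y == 0 && residual.x == 0)
        return 0;
    else
        // atan2() takes both signs into account, so the angle is correct
        // for any mutual arrangement of the two points:
        return toIntRounded(atan2(toDouble(residual.y), toDouble(residual.x)) * 180.0 / CV_PI);
}
```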
### Getting a forehead contour {#gapi_fb_ld_fhd}
The function approximates the forehead contour:
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp ld_pp_fhd
As we have only jaw points among the detected landmarks, we have to build a
half-ellipse based on three points of the jaw: the leftmost, the
rightmost and the lowest one. The forehead width is assumed to be equal to the
jaw width, and the latter is calculated using the left and the
right points. Speaking of the \f$Y\f$ axis, we have no points to get
it directly, and instead assume that the forehead height is about \f$2/3\f$
of the jaw height, which can be figured out from the face center (the
middle between the left and right points) and the lowest jaw point.
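A condensed sketch of the `ld_pp_fhd` snippet is shown below; the \f$Y\f$
half-axis line is reconstructed from the description above and may differ
slightly from the actual sample code:

```cpp
inline Contour custom::getForeheadEllipse(const cv::Point &ptJawLeft,
                                          const cv::Point &ptJawRight,
                                          const cv::Point &ptJawLower)
{
    Contour cntForehead;
    // The point amid the top two jaw points is the ellipse center:
    const cv::Point ptFaceCenter((ptJawLeft + ptJawRight) / 2);
    // The X half-axis is half of the jaw (and assumed forehead) width:
    const int axisX = toIntRounded(cv::norm(ptJawRight - ptJawLeft) / 2.0);
    // The Y half-axis: the forehead height is assumed to be ~2/3 of the
    // jaw height (distance from the face center to the lowest jaw point):
    const int axisY = toIntRounded(cv::norm(ptJawLower - ptFaceCenter) * 2.0 / 3.0);
    const int angFace = getLineInclinationAngleDegrees(ptJawLeft, ptJawRight);
    // We need the upper part of an ellipse:
    static constexpr int kAngForeheadStart = 180;
    static constexpr int kAngForeheadEnd   = 360;
    cv::ellipse2Poly(ptFaceCenter, cv::Size(axisX, axisY), angFace,
                     kAngForeheadStart, kAngForeheadEnd, config::kAngDelta, cntForehead);
    return cntForehead;
}
```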
## Drawing masks {#gapi_fb_masks_drw}
When we have all the contours needed, we are able to draw masks:
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp msk_ppline
The steps to get the masks are as follows (a condensed G-API sketch of these
steps is given after the list):
* the "sharp" mask calculation:
* fill the contours that should be sharpened;
* blur that to get the "sharp" mask (`mskSharpG`);
* the "bilateral" mask calculation:
* fill all the face contours fully;
* blur that;
* subtract areas which intersect with the "sharp" mask --- and get the
"bilateral" mask (`mskBlurFinal`);
* the background mask calculation:
* add the two previous masks;
* set all non-zero pixels of the result to 255 (by `cv::gapi::threshold()`);
* invert the output (by `cv::gapi::bitwise_not()`) to get the background
mask (`mskNoFaces`).
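The corresponding graph expressions (reproduced in a condensed form from the
`msk_ppline` snippet; all masks are created as CV_8UC1) are:

```cpp
cv::GMat mskSharp        = custom::GFillPolyGContours::on(gimgIn, garElsConts);  // fill the element contours
cv::GMat mskSharpG       = cv::gapi::gaussianBlur(mskSharp, config::kGKernelSize,
                                                  config::kGSigma);              // the "sharp" mask
cv::GMat mskBlur         = custom::GFillPolyGContours::on(gimgIn, garFaceConts); // fill the face contours
cv::GMat mskBlurG        = cv::gapi::gaussianBlur(mskBlur, config::kGKernelSize,
                                                  config::kGSigma);
cv::GMat mskBlurFinal    = mskBlurG - cv::gapi::mask(mskBlurG, mskSharpG);       // the "bilateral" mask
cv::GMat mskFacesGaussed = mskBlurFinal + mskSharpG;
cv::GMat mskFacesWhite   = cv::gapi::threshold(mskFacesGaussed, 0, 255, cv::THRESH_BINARY);
cv::GMat mskNoFaces      = cv::gapi::bitwise_not(mskFacesWhite);                 // the background mask
```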
# Configuring and running the pipeline {#gapi_fb_comp_args}
Once the graph is fully expressed, we can finally compile it and run it
on real data. G-API graph compilation is the stage where the G-API
framework actually understands which kernels and networks to use. This
configuration happens via G-API compilation arguments.
## DNN parameters {#gapi_fb_comp_args_net}
This sample uses the OpenVINO™ Toolkit Inference Engine backend for DL
inference, which is configured the following way:
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp net_param
Every `cv::gapi::ie::Params<>` object is related to the network
specified in its template argument. We should pass there the network
type we have defined with `G_API_NET()` at the beginning of the
tutorial.
Network parameters are then wrapped into a single package by
`cv::gapi::networks()`:
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp netw
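Putting the two pieces together, the configuration looks like this
(reproduced from the `net_param` and `netw` snippets):

```cpp
auto faceParams  = cv::gapi::ie::Params<custom::FaceDetector>
{
    /*std::string*/ faceXmlPath,    // path to the topology IR (.xml)
    /*std::string*/ faceBinPath,    // path to the weights (.bin)
    /*std::string*/ faceDevice      // device to run on, e.g. "GPU"
};
auto landmParams = cv::gapi::ie::Params<custom::LandmDetector>
{
    /*std::string*/ landmXmlPath,
    /*std::string*/ landmBinPath,
    /*std::string*/ landmDevice     // e.g. "CPU"
};
auto networks = cv::gapi::networks(faceParams, landmParams);
```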
More details in "Face Analytics Pipeline"
([Configuring the pipeline](@ref gapi_ifd_configuration) section).
## Kernel packages {#gapi_fb_comp_args_kernels}
In this example we use a lot of custom kernels; in addition to that, we
use the Fluid backend to optimize memory consumption for G-API's standard
kernels where applicable. The resulting kernel package is formed like this:
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp kern_pass_1
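For reference, the package from the `kern_pass_1` snippet is built as follows
(the list of custom kernels is abridged here):

```cpp
auto customKernels = cv::gapi::kernels<custom::GCPUBilateralFilter,
                                       custom::GCPULaplacian,
                                       custom::GCPUFillPolyGContours,
                                       // ...the remaining custom kernels...
                                       custom::GCPUGetContours>();
auto kernels = cv::gapi::combine(cv::gapi::core::fluid::kernels(),
                                 customKernels);
```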
## Compiling the streaming pipeline {#gapi_fb_compiling}
G-API optimizes execution for video streams when compiled in the
"Streaming" mode.
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp str_comp
More on this in "Face Analytics Pipeline"
([Configuring the pipeline](@ref gapi_ifd_configuration) section).
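The compilation call itself (from the `str_comp` snippet) is a one-liner; note
that no input metadata is passed here, so the pipeline adapts to the format of
the actual video source set later:

```cpp
cv::GStreamingCompiled stream = pipeline.compileStreaming(cv::compile_args(kernels, networks));
```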
## Running the streaming pipeline {#gapi_fb_running}
In order to run the G-API streaming pipeline, all we need is to
specify the input video source, call
`cv::GStreamingCompiled::start()`, and then fetch the pipeline
processing results:
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp str_src
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp str_loop
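A condensed view of these snippets (with the optional drawing of face boxes
and landmarks omitted) looks like this:

```cpp
stream.setSource(cv::gapi::wip::make_src<cv::gapi::wip::GCaptureSource>(parser.get<cv::String>("input")));

cv::Mat imgShow, imgBeautif;
std::vector<Contour> vctFaceConts, vctElsConts;
VectorROI vctRects;
auto out_vector = cv::gout(imgBeautif, imgShow, vctFaceConts, vctElsConts, vctRects);
stream.start();
while (stream.running())
{
    if (!stream.try_pull(std::move(out_vector)))
    {
        // No data ready yet -- let the UI refresh (and handle a keypress):
        if (cv::waitKey(1) >= 0) break;
        else continue;
    }
    cv::imshow(config::kWinInput,              imgShow);
    cv::imshow(config::kWinFaceBeautification, imgBeautif);
}
```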
Once results are ready and can be pulled from the pipeline, we display
them on the screen and handle GUI events.
See [Running the pipeline](@ref gapi_ifd_running) section
in the "Face Analytics Pipeline" tutorial for more details.
# Conclusion {#gapi_fb_cncl}
The tutorial has two goals: to show the use of brand-new features of
G-API introduced in OpenCV 4.2, and to give a basic understanding of a
sample face beautification algorithm.
The result of applying the algorithm:
![Face Beautification example](pics/example.jpg)
On the test machine (Intel® Core™ i7-8700) the G-API-optimized video
pipeline outperforms its serial (non-pipelined) version by a factor of
**2.7** -- meaning that for such a non-trivial graph, proper
pipelining can bring an almost 3x increase in performance.
<!---
The general idea is to implement real-time video stream processing that
detects faces and applies some filters to make them look beautiful (more or
less). The pipeline is the following:
Two topologies from OMZ have been used in this sample: the
<a href="https://github.com/opencv/open_model_zoo/tree/master/models/intel
/face-detection-adas-0001">face-detection-adas-0001</a>
and the
<a href="https://github.com/opencv/open_model_zoo/blob/master/models/intel
/facial-landmarks-35-adas-0002/description/facial-landmarks-35-adas-0002.md">
facial-landmarks-35-adas-0002</a>.
The face detector takes the input image and returns a blob with the shape
[1,1,200,7] after the inference (200 is the maximum number of
faces which can be detected).
In order to process every face individually, we need to convert this output to a
list of regions on the image.
The masks for different filters are built based on facial landmarks, which are
inferred for every face. The result of the inference
is a blob with 35 landmarks: the first 18 of them are facial elements
(eyes, eyebrows, a nose, a mouth) and the last 17 --- a jaw contour. Landmarks
are floating point coordinate values normalized relative to an input ROI
(not the original frame). In addition, for the further goals we need contours of
eyes, mouths, faces, etc., not the landmarks, so post-processing of the Mat is
also required here. The process is split into two parts --- denormalizing the
landmarks' coordinates to the real pixel coordinates of the source frame
and getting the necessary closed contours based on these coordinates.
The last step of processing the inference data is drawing masks using the
calculated contours. In this demo the contours don't need to be pixel accurate,
since masks are blurred with a Gaussian filter anyway. Another point that should
be mentioned here is getting
three masks (for areas to be smoothed, for ones to be sharpened and for the
background) which have no intersections with each other; this approach allows us
to apply the calculated masks to the corresponding images prepared beforehand and
then just sum them to get the output image without any other actions.
As we can see, this algorithm is appropriate to illustrate G-API usage
convenience and efficiency in the context of solving a real CV/DL problem.
(On detector post-proc)
Some points to be mentioned about this kernel implementation:
- It takes a `cv::Mat` from the detector and a `cv::Mat` from the input; it
returns an array of ROI's where faces have been detected.
- `cv::Mat` data parsing via a pointer to float is used here.
- By far the most important thing here is solving the issue that the detector
sometimes returns coordinates located outside of the image; if we pass such an
ROI to be processed, errors in the landmarks detection will occur. The frame box
`borders` is created and then intersected with the face rectangle
(by `operator&()`) to handle such cases and keep the ROI which is for sure
inside the frame.
Data parsing after the facial landmarks detector happens according to the same
scheme with minor adjustments.
## Possible further improvements
There are some points in the algorithm to be improved.
### Correct ROI reshaping for meeting conditions required by the facial landmarks detector
The input of the facial landmarks detector is a square ROI, but the face
detector gives non-square rectangles in general. If we let the backend within
Inference-API compress the rectangle to a square by itself, a loss of
inference accuracy can be noticed in some cases.
There is a solution: we can pass a bounding square ROI instead of the
rectangular one to the landmarks detector, so there will be no need to compress
the ROI, which will lead to an accuracy improvement.
Unfortunately, another problem occurs if we do that:
if the rectangular ROI is near the border, its bounding square will probably go
out of the frame --- which leads to errors of the landmarks detector.
To avoid such a mistake, we have to implement an algorithm that, firstly,
bounds every rectangle by a square, then finds the farthest coordinates
that turn out to be outside of the frame and, finally, pads the source image
with borders (e.g. single-colored) of the computed size. It will be safe to take
square ROIs for the facial landmarks detector after that frame adjustment.
### Research for the best parameters (used in GaussianBlur() or unsharpMask(), etc.)
### Parameters autoscaling
-->

Binary file not shown (172 KiB).

View File

@ -29,3 +29,14 @@ how G-API module can be used for that.
is ported on G-API, covering the basic intuition behind this
transition process, and examining benefits which a graph model
brings there.
- @subpage tutorial_gapi_face_beautification
*Languages:* C++
*Compatibility:* \> OpenCV 4.2
*Author:* Orest Chura
In this tutorial we build a complex hybrid Computer Vision/Deep
Learning video processing pipeline with G-API.

View File

@ -197,7 +197,7 @@ void drawPrimitivesOCV(cv::Mat& in,
const auto& ftp = cv::util::get<FText>(p);
const auto color = converter.cvtColor(ftp.color);
GAPI_Assert(ftpr && "I must pass cv::gapi::wip::draw::freetype_font"
GAPI_Assert(ftpr && "You must pass cv::gapi::wip::draw::freetype_font"
" to the graph compile arguments");
int baseline = 0;
auto size = ftpr->getTextSize(ftp.text, ftp.fh, &baseline);

View File

@ -4,6 +4,9 @@
//
// Copyright (C) 2018-2019 Intel Corporation
#include "opencv2/opencv_modules.hpp"
#if defined(HAVE_OPENCV_GAPI)
#include <opencv2/gapi.hpp>
#include <opencv2/gapi/core.hpp>
#include <opencv2/gapi/imgproc.hpp>
@ -11,16 +14,31 @@
#include <opencv2/gapi/infer.hpp>
#include <opencv2/gapi/infer/ie.hpp>
#include <opencv2/gapi/cpu/gcpukernel.hpp>
#include "opencv2/gapi/streaming/cap.hpp"
#include <opencv2/gapi/streaming/cap.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/highgui.hpp>
#include <iomanip>
#include <opencv2/highgui.hpp> // windows
namespace config
{
constexpr char kWinFaceBeautification[] = "FaceBeautificator";
constexpr char kWinInput[] = "Input";
constexpr char kParserAbout[] =
"Use this script to run the face beautification algorithm with G-API.";
constexpr char kParserOptions[] =
"{ help h || print the help message. }"
"{ facepath f || a path to a Face detection model file (.xml).}"
"{ facedevice |GPU| the face detection computation device.}"
"{ landmpath l || a path to a Landmarks detection model file (.xml).}"
"{ landmdevice |CPU| the landmarks detection computation device.}"
"{ input i || a path to an input. Skip to capture from a camera.}"
"{ boxes b |false| set true to draw face Boxes in the \"Input\" window.}"
"{ landmarks m |false| set true to draw landMarks in the \"Input\" window.}"
"{ streaming s |true| set false to disable stream pipelining.}"
"{ performance p |false| set true to disable output displaying.}";
const cv::Scalar kClrWhite (255, 255, 255);
const cv::Scalar kClrGreen ( 0, 255, 0);
const cv::Scalar kClrYellow( 0, 255, 255);
@ -36,13 +54,13 @@ constexpr int kUnshSigma = 3;
constexpr float kUnshStrength = 0.7f;
constexpr int kAngDelta = 1;
constexpr bool kClosedLine = true;
const size_t kNumPointsInHalfEllipse = 180 / config::kAngDelta + 1;
} // namespace config
namespace
{
//! [vec_ROI]
using VectorROI = std::vector<cv::Rect>;
//! [vec_ROI]
using GArrayROI = cv::GArray<cv::Rect>;
using Contour = std::vector<cv::Point>;
using Landmarks = std::vector<cv::Point>;
@ -54,10 +72,35 @@ template<typename Tp> inline int toIntRounded(const Tp x)
return static_cast<int>(std::lround(x));
}
//! [toDbl]
template<typename Tp> inline double toDouble(const Tp x)
{
return static_cast<double>(x);
}
//! [toDbl]
struct Avg {
struct Elapsed {
explicit Elapsed(double ms) : ss(ms / 1000.),
mm(toIntRounded(ss / 60)) {}
const double ss;
const int mm;
};
using MS = std::chrono::duration<double, std::ratio<1, 1000>>;
using TS = std::chrono::time_point<std::chrono::high_resolution_clock>;
TS started;
void start() { started = now(); }
TS now() const { return std::chrono::high_resolution_clock::now(); }
double tick() const { return std::chrono::duration_cast<MS>(now() - started).count(); }
Elapsed elapsed() const { return Elapsed{tick()}; }
double fps(std::size_t n) const { return static_cast<double>(n) / (tick() / 1000.); }
};
std::ostream& operator<<(std::ostream &os, const Avg::Elapsed &e) {
os << e.mm << ':' << (e.ss - 60*e.mm);
return os;
}
std::string getWeightsPath(const std::string &mdlXMLPath) // mdlXMLPath =
// "The/Full/Path.xml"
@ -77,31 +120,28 @@ namespace custom
{
using TplPtsFaceElements_Jaw = std::tuple<cv::GArray<Landmarks>,
cv::GArray<Contour>>;
using TplFaces_FaceElements = std::tuple<cv::GArray<Contour>,
cv::GArray<Contour>>;
// Wrapper-functions
inline int getLineInclinationAngleDegrees(const cv::Point &ptLeft,
const cv::Point &ptRight);
inline Contour getForeheadEllipse(const cv::Point &ptJawLeft,
const cv::Point &ptJawRight,
const cv::Point &ptJawMiddle,
const size_t capacity);
const cv::Point &ptJawMiddle);
inline Contour getEyeEllipse(const cv::Point &ptLeft,
const cv::Point &ptRight,
const size_t capacity);
const cv::Point &ptRight);
inline Contour getPatchedEllipse(const cv::Point &ptLeft,
const cv::Point &ptRight,
const cv::Point &ptUp,
const cv::Point &ptDown);
// Networks
//! [net_decl]
G_API_NET(FaceDetector, <cv::GMat(cv::GMat)>, "face_detector");
G_API_NET(LandmDetector, <cv::GMat(cv::GMat)>, "landm_detector");
//! [net_decl]
// Function kernels
G_TYPED_KERNEL(GBilatFilter,
<cv::GMat(cv::GMat,int,double,double)>,
G_TYPED_KERNEL(GBilatFilter, <cv::GMat(cv::GMat,int,double,double)>,
"custom.faceb12n.bilateralFilter")
{
static cv::GMatDesc outMeta(cv::GMatDesc in, int,double,double)
@ -110,8 +150,7 @@ G_TYPED_KERNEL(GBilatFilter,
}
};
G_TYPED_KERNEL(GLaplacian,
<cv::GMat(cv::GMat,int)>,
G_TYPED_KERNEL(GLaplacian, <cv::GMat(cv::GMat,int)>,
"custom.faceb12n.Laplacian")
{
static cv::GMatDesc outMeta(cv::GMatDesc in, int)
@ -120,8 +159,7 @@ G_TYPED_KERNEL(GLaplacian,
}
};
G_TYPED_KERNEL(GFillPolyGContours,
<cv::GMat(cv::GMat,cv::GArray<Contour>)>,
G_TYPED_KERNEL(GFillPolyGContours, <cv::GMat(cv::GMat,cv::GArray<Contour>)>,
"custom.faceb12n.fillPolyGContours")
{
static cv::GMatDesc outMeta(cv::GMatDesc in, cv::GArrayDesc)
@ -130,8 +168,8 @@ G_TYPED_KERNEL(GFillPolyGContours,
}
};
G_TYPED_KERNEL(GPolyLines,
<cv::GMat(cv::GMat,cv::GArray<Contour>,bool,cv::Scalar)>,
G_TYPED_KERNEL(GPolyLines, <cv::GMat(cv::GMat,cv::GArray<Contour>,bool,
cv::Scalar)>,
"custom.faceb12n.polyLines")
{
static cv::GMatDesc outMeta(cv::GMatDesc in, cv::GArrayDesc,bool,cv::Scalar)
@ -140,8 +178,7 @@ G_TYPED_KERNEL(GPolyLines,
}
};
G_TYPED_KERNEL(GRectangle,
<cv::GMat(cv::GMat,GArrayROI,cv::Scalar)>,
G_TYPED_KERNEL(GRectangle, <cv::GMat(cv::GMat,GArrayROI,cv::Scalar)>,
"custom.faceb12n.rectangle")
{
static cv::GMatDesc outMeta(cv::GMatDesc in, cv::GArrayDesc,cv::Scalar)
@ -150,8 +187,7 @@ G_TYPED_KERNEL(GRectangle,
}
};
G_TYPED_KERNEL(GFacePostProc,
<GArrayROI(cv::GMat,cv::GMat,float)>,
G_TYPED_KERNEL(GFacePostProc, <GArrayROI(cv::GMat,cv::GMat,float)>,
"custom.faceb12n.faceDetectPostProc")
{
static cv::GArrayDesc outMeta(const cv::GMatDesc&,const cv::GMatDesc&,float)
@ -160,8 +196,8 @@ G_TYPED_KERNEL(GFacePostProc,
}
};
G_TYPED_KERNEL_M(GLandmPostProc,
<TplPtsFaceElements_Jaw(cv::GArray<cv::GMat>,GArrayROI)>,
G_TYPED_KERNEL_M(GLandmPostProc, <TplPtsFaceElements_Jaw(cv::GArray<cv::GMat>,
GArrayROI)>,
"custom.faceb12n.landmDetectPostProc")
{
static std::tuple<cv::GArrayDesc,cv::GArrayDesc> outMeta(
@ -171,17 +207,17 @@ G_TYPED_KERNEL_M(GLandmPostProc,
}
};
G_TYPED_KERNEL_M(GGetContours,
<TplFaces_FaceElements(cv::GArray<Landmarks>,
cv::GArray<Contour>)>,
//! [kern_m_decl]
using TplFaces_FaceElements = std::tuple<cv::GArray<Contour>, cv::GArray<Contour>>;
G_TYPED_KERNEL_M(GGetContours, <TplFaces_FaceElements (cv::GArray<Landmarks>, cv::GArray<Contour>)>,
"custom.faceb12n.getContours")
{
static std::tuple<cv::GArrayDesc,cv::GArrayDesc> outMeta(
const cv::GArrayDesc&,const cv::GArrayDesc&)
static std::tuple<cv::GArrayDesc,cv::GArrayDesc> outMeta(const cv::GArrayDesc&,const cv::GArrayDesc&)
{
return std::make_tuple(cv::empty_array_desc(), cv::empty_array_desc());
}
};
//! [kern_m_decl]
// OCV_Kernels
@ -262,11 +298,12 @@ GAPI_OCV_KERNEL(GCPURectangle, custom::GRectangle)
// A face detector outputs a blob with the shape: [1, 1, N, 7], where N is
// the number of detected bounding boxes. Structure of an output for every
// detected face is the following:
// [image_id, label, conf, x_min, y_min, x_max, y_max]; all the seven elements
// [image_id, label, conf, x_min, y_min, x_max, y_max], all the seven elements
// are floating point. For more details please visit:
// https://github.com/opencv/open_model_zoo/blob/master/intel_models/face-detection-adas-0001
// https://github.com/opencv/open_model_zoo/blob/master/intel_models/face-detection-adas-0001
// This kernel is the face detection output blob parsing that returns a vector
// of detected faces' rects:
//! [fd_pp]
GAPI_OCV_KERNEL(GCPUFacePostProc, GFacePostProc)
{
static void run(const cv::Mat &inDetectResult,
@ -289,12 +326,17 @@ GAPI_OCV_KERNEL(GCPUFacePostProc, GFacePostProc)
break;
}
const float faceConfidence = data[i * kObjectSize + 2];
// We can filter the detections by the `conf` field
// to avoid the detector's mistakes.
if (faceConfidence > faceConfThreshold)
{
const float left = data[i * kObjectSize + 3];
const float top = data[i * kObjectSize + 4];
const float right = data[i * kObjectSize + 5];
const float bottom = data[i * kObjectSize + 6];
// These are normalized coordinates and are between 0 and 1;
// to get the real pixel coordinates we should multiply them by
// the image width and height respectively:
cv::Point tl(toIntRounded(left * imgCols),
toIntRounded(top * imgRows));
cv::Point br(toIntRounded(right * imgCols),
@ -304,10 +346,18 @@ GAPI_OCV_KERNEL(GCPUFacePostProc, GFacePostProc)
}
}
};
//! [fd_pp]
// This kernel is the facial landmarks detection output Mat parsing for every
// detected face; returns a tuple containing a vector of vectors of
// face elements' Points and a vector of vectors of jaw's Points:
// There are 35 landmarks given by the default detector for each face
// in a frame; the first 18 of them are face elements (eyes, eyebrows,
// a nose, a mouth) and the last 17 - a jaw contour. The detector gives
// floating point values for landmarks' normalized coordinates relative
// to an input ROI (not the original frame).
// For more details please visit:
// https://github.com/opencv/open_model_zoo/blob/master/intel_models/facial-landmarks-35-adas-0002
GAPI_OCV_KERNEL(GCPULandmPostProc, GLandmPostProc)
{
static void run(const std::vector<cv::Mat> &vctDetectResults,
@ -315,13 +365,6 @@ GAPI_OCV_KERNEL(GCPULandmPostProc, GLandmPostProc)
std::vector<Landmarks> &vctPtsFaceElems,
std::vector<Contour> &vctCntJaw)
{
// There are 35 landmarks given by the default detector for each face
// in a frame; the first 18 of them are face elements (eyes, eyebrows,
// a nose, a mouth) and the last 17 - a jaw contour. The detector gives
// floating point values for landmarks' normed coordinates relatively
// to an input ROI (not the original frame).
// For more details please visit:
// https://github.com/opencv/open_model_zoo/blob/master/intel_models/facial-landmarks-35-adas-0002
static constexpr int kNumFaceElems = 18;
static constexpr int kNumTotal = 35;
const size_t numFaces = vctRects.size();
@ -342,10 +385,8 @@ GAPI_OCV_KERNEL(GCPULandmPostProc, GLandmPostProc)
ptsFaceElems.clear();
for (int j = 0; j < kNumFaceElems * 2; j += 2)
{
cv::Point pt =
cv::Point(toIntRounded(data[j] * vctRects[i].width),
toIntRounded(data[j+1] * vctRects[i].height))
+ vctRects[i].tl();
cv::Point pt = cv::Point(toIntRounded(data[j] * vctRects[i].width),
toIntRounded(data[j+1] * vctRects[i].height)) + vctRects[i].tl();
ptsFaceElems.push_back(pt);
}
vctPtsFaceElems.push_back(ptsFaceElems);
@ -354,10 +395,8 @@ GAPI_OCV_KERNEL(GCPULandmPostProc, GLandmPostProc)
cntJaw.clear();
for(int j = kNumFaceElems * 2; j < kNumTotal * 2; j += 2)
{
cv::Point pt =
cv::Point(toIntRounded(data[j] * vctRects[i].width),
toIntRounded(data[j+1] * vctRects[i].height))
+ vctRects[i].tl();
cv::Point pt = cv::Point(toIntRounded(data[j] * vctRects[i].width),
toIntRounded(data[j+1] * vctRects[i].height)) + vctRects[i].tl();
cntJaw.push_back(pt);
}
vctCntJaw.push_back(cntJaw);
@ -368,23 +407,24 @@ GAPI_OCV_KERNEL(GCPULandmPostProc, GLandmPostProc)
// This kernel is the facial landmarks detection post-processing for every face
// detected before; output is a tuple of vectors of detected face contours and
// facial elements contours:
//! [ld_pp_cnts]
//! [kern_m_impl]
GAPI_OCV_KERNEL(GCPUGetContours, GGetContours)
{
static void run(const std::vector<Landmarks> &vctPtsFaceElems,
const std::vector<Contour> &vctCntJaw,
static void run(const std::vector<Landmarks> &vctPtsFaceElems, // 18 landmarks of the facial elements
const std::vector<Contour> &vctCntJaw, // 17 landmarks of a jaw
std::vector<Contour> &vctElemsContours,
std::vector<Contour> &vctFaceContours)
{
//! [kern_m_impl]
size_t numFaces = vctCntJaw.size();
CV_Assert(numFaces == vctPtsFaceElems.size());
CV_Assert(vctElemsContours.size() == 0ul);
CV_Assert(vctFaceContours.size() == 0ul);
// vctFaceElemsContours will store all the face elements' contours found
// on an input image, namely 4 elements (two eyes, nose, mouth)
// for every detected face
// in an input image, namely 4 elements (two eyes, nose, mouth) for every detected face:
vctElemsContours.reserve(numFaces * 4);
// vctFaceElemsContours will store all the faces' contours found on
// an input image
// vctFaceContours will store all the faces' contours found in an input image:
vctFaceContours.reserve(numFaces);
Contour cntFace, cntLeftEye, cntRightEye, cntNose, cntMouth;
@ -393,63 +433,47 @@ GAPI_OCV_KERNEL(GCPUGetContours, GGetContours)
for (size_t i = 0ul; i < numFaces; i++)
{
// The face elements contours
// A left eye:
// Approximating the lower eye contour by half-ellipse
// (using eye points) and storing in cntLeftEye:
cntLeftEye = getEyeEllipse(vctPtsFaceElems[i][1],
vctPtsFaceElems[i][0],
config::kNumPointsInHalfEllipse + 3);
// Approximating the lower eye contour by half-ellipse (using eye points) and storing in cntLeftEye:
cntLeftEye = getEyeEllipse(vctPtsFaceElems[i][1], vctPtsFaceElems[i][0]);
// Pushing the left eyebrow clock-wise:
cntLeftEye.insert(cntLeftEye.cend(), {vctPtsFaceElems[i][12],
vctPtsFaceElems[i][13],
cntLeftEye.insert(cntLeftEye.cend(), {vctPtsFaceElems[i][12], vctPtsFaceElems[i][13],
vctPtsFaceElems[i][14]});
// A right eye:
// Approximating the lower eye contour by half-ellipse
// (using eye points) and storing in vctRightEye:
cntRightEye = getEyeEllipse(vctPtsFaceElems[i][2],
vctPtsFaceElems[i][3],
config::kNumPointsInHalfEllipse + 3);
// Approximating the lower eye contour by half-ellipse (using eye points) and storing in vctRightEye:
cntRightEye = getEyeEllipse(vctPtsFaceElems[i][2], vctPtsFaceElems[i][3]);
// Pushing the right eyebrow clock-wise:
cntRightEye.insert(cntRightEye.cend(), {vctPtsFaceElems[i][15],
vctPtsFaceElems[i][16],
cntRightEye.insert(cntRightEye.cend(), {vctPtsFaceElems[i][15], vctPtsFaceElems[i][16],
vctPtsFaceElems[i][17]});
// A nose:
// Storing the nose points clock-wise
cntNose.clear();
cntNose.insert(cntNose.cend(), {vctPtsFaceElems[i][4],
vctPtsFaceElems[i][7],
vctPtsFaceElems[i][5],
vctPtsFaceElems[i][6]});
cntNose.insert(cntNose.cend(), {vctPtsFaceElems[i][4], vctPtsFaceElems[i][7],
vctPtsFaceElems[i][5], vctPtsFaceElems[i][6]});
// A mouth:
// Approximating the mouth contour by two half-ellipses
// (using mouth points) and storing in vctMouth:
cntMouth = getPatchedEllipse(vctPtsFaceElems[i][8],
vctPtsFaceElems[i][9],
vctPtsFaceElems[i][10],
vctPtsFaceElems[i][11]);
// Approximating the mouth contour by two half-ellipses (using mouth points) and storing in vctMouth:
cntMouth = getPatchedEllipse(vctPtsFaceElems[i][8], vctPtsFaceElems[i][9],
vctPtsFaceElems[i][10], vctPtsFaceElems[i][11]);
// Storing all the elements in a vector:
vctElemsContours.insert(vctElemsContours.cend(), {cntLeftEye,
cntRightEye,
cntNose,
cntMouth});
vctElemsContours.insert(vctElemsContours.cend(), {cntLeftEye, cntRightEye, cntNose, cntMouth});
// The face contour:
// Approximating the forehead contour by half-ellipse
// (using jaw points) and storing in vctFace:
cntFace = getForeheadEllipse(vctCntJaw[i][0], vctCntJaw[i][16],
vctCntJaw[i][8],
config::kNumPointsInHalfEllipse +
vctCntJaw[i].size());
// The ellipse is drawn clock-wise, but jaw contour points goes
// vice versa, so it's necessary to push cntJaw from the end
// to the begin using a reverse iterator:
std::copy(vctCntJaw[i].crbegin(), vctCntJaw[i].crend(),
std::back_inserter(cntFace));
// Approximating the forehead contour by half-ellipse (using jaw points) and storing in vctFace:
cntFace = getForeheadEllipse(vctCntJaw[i][0], vctCntJaw[i][16], vctCntJaw[i][8]);
// The ellipse is drawn clock-wise, but jaw contour points go vice versa, so it's necessary to push
// cntJaw from the end to the beginning using a reverse iterator:
std::copy(vctCntJaw[i].crbegin(), vctCntJaw[i].crend(), std::back_inserter(cntFace));
// Storing the face contour in another vector:
vctFaceContours.push_back(cntFace);
}
}
};
//! [ld_pp_cnts]
// GAPI subgraph functions
inline cv::GMat unsharpMask(const cv::GMat &src,
@ -463,27 +487,26 @@ inline cv::GMat mask3C(const cv::GMat &src,
// Functions implementation:
// Returns an angle (in degrees) between a line given by two Points and
// the horizon. Note that the result depends on the arguments order:
inline int custom::getLineInclinationAngleDegrees(const cv::Point &ptLeft,
const cv::Point &ptRight)
//! [ld_pp_incl]
inline int custom::getLineInclinationAngleDegrees(const cv::Point &ptLeft, const cv::Point &ptRight)
{
const cv::Point residual = ptRight - ptLeft;
if (residual.y == 0 && residual.x == 0)
return 0;
else
return toIntRounded(atan2(toDouble(residual.y), toDouble(residual.x))
* 180.0 / M_PI);
return toIntRounded(atan2(toDouble(residual.y), toDouble(residual.x)) * 180.0 / CV_PI);
}
//! [ld_pp_incl]
// Approximates a forehead by half-ellipse using jaw points and some geometry
// and then returns points of the contour.
//! [ld_pp_fhd]
inline Contour custom::getForeheadEllipse(const cv::Point &ptJawLeft,
const cv::Point &ptJawRight,
const cv::Point &ptJawLower,
const size_t capacity = 0)
const cv::Point &ptJawLower)
{
Contour cntForehead;
cntForehead.reserve(std::max(capacity, config::kNumPointsInHalfEllipse));
// The point amid the top two points of a jaw:
const cv::Point ptFaceCenter((ptJawLeft + ptJawRight) / 2);
// This will be the center of the ellipse.
@ -505,21 +528,18 @@ inline Contour custom::getForeheadEllipse(const cv::Point &ptJawLeft,
// We need the upper part of an ellipse:
static constexpr int kAngForeheadStart = 180;
static constexpr int kAngForeheadEnd = 360;
cv::ellipse2Poly(ptFaceCenter, cv::Size(axisX, axisY), angFace,
kAngForeheadStart, kAngForeheadEnd, config::kAngDelta,
cntForehead);
cv::ellipse2Poly(ptFaceCenter, cv::Size(axisX, axisY), angFace, kAngForeheadStart, kAngForeheadEnd,
config::kAngDelta, cntForehead);
return cntForehead;
}
//! [ld_pp_fhd]
// Approximates the lower eye contour by half-ellipse using eye points and some
// geometry and then returns points of the contour; "capacity" is used
// to reserve enough memory as there will be other points inserted.
inline Contour custom::getEyeEllipse(const cv::Point &ptLeft,
const cv::Point &ptRight,
const size_t capacity = 0)
// geometry and then returns points of the contour.
//! [ld_pp_eye]
inline Contour custom::getEyeEllipse(const cv::Point &ptLeft, const cv::Point &ptRight)
{
Contour cntEyeBottom;
cntEyeBottom.reserve(std::max(capacity, config::kNumPointsInHalfEllipse));
const cv::Point ptEyeCenter((ptRight + ptLeft) / 2);
const int angle = getLineInclinationAngleDegrees(ptLeft, ptRight);
const int axisX = toIntRounded(cv::norm(ptRight - ptLeft) / 2.0);
@ -529,10 +549,11 @@ inline Contour custom::getEyeEllipse(const cv::Point &ptLeft,
// We need the lower part of an ellipse:
static constexpr int kAngEyeStart = 0;
static constexpr int kAngEyeEnd = 180;
cv::ellipse2Poly(ptEyeCenter, cv::Size(axisX, axisY), angle, kAngEyeStart,
kAngEyeEnd, config::kAngDelta, cntEyeBottom);
cv::ellipse2Poly(ptEyeCenter, cv::Size(axisX, axisY), angle, kAngEyeStart, kAngEyeEnd, config::kAngDelta,
cntEyeBottom);
return cntEyeBottom;
}
//! [ld_pp_eye]
// This function approximates an object (a mouth) by two half-ellipses using
// 4 points of the axes' ends and then returns points of the contour:
@ -552,8 +573,7 @@ inline Contour custom::getPatchedEllipse(const cv::Point &ptLeft,
// We need the upper part of an ellipse:
static constexpr int angTopStart = 180;
static constexpr int angTopEnd = 360;
cv::ellipse2Poly(ptMouthCenter, cv::Size(axisX, axisYTop), angMouth,
angTopStart, angTopEnd, config::kAngDelta, cntMouthTop);
cv::ellipse2Poly(ptMouthCenter, cv::Size(axisX, axisYTop), angMouth, angTopStart, angTopEnd, config::kAngDelta, cntMouthTop);
// The bottom half-ellipse:
Contour cntMouth;
@ -561,16 +581,14 @@ inline Contour custom::getPatchedEllipse(const cv::Point &ptLeft,
// We need the lower part of an ellipse:
static constexpr int angBotStart = 0;
static constexpr int angBotEnd = 180;
cv::ellipse2Poly(ptMouthCenter, cv::Size(axisX, axisYBot), angMouth,
angBotStart, angBotEnd, config::kAngDelta, cntMouth);
cv::ellipse2Poly(ptMouthCenter, cv::Size(axisX, axisYBot), angMouth, angBotStart, angBotEnd, config::kAngDelta, cntMouth);
// Pushing the upper part to cntMouth
cntMouth.reserve(cntMouth.size() + cntMouthTop.size());
std::copy(cntMouthTop.cbegin(), cntMouthTop.cend(),
std::back_inserter(cntMouth));
std::copy(cntMouthTop.cbegin(), cntMouthTop.cend(), std::back_inserter(cntMouth));
return cntMouth;
}
//! [unsh]
inline cv::GMat custom::unsharpMask(const cv::GMat &src,
const int sigma,
const float strength)
@ -579,6 +597,7 @@ inline cv::GMat custom::unsharpMask(const cv::GMat &src,
cv::GMat laplacian = custom::GLaplacian::on(blurred, CV_8U);
return (src - (laplacian * strength));
}
//! [unsh]
inline cv::GMat custom::mask3C(const cv::GMat &src,
const cv::GMat &mask)
@ -593,30 +612,17 @@ inline cv::GMat custom::mask3C(const cv::GMat &src,
int main(int argc, char** argv)
{
cv::CommandLineParser parser(argc, argv,
"{ help h || print the help message. }"
cv::namedWindow(config::kWinFaceBeautification, cv::WINDOW_NORMAL);
cv::namedWindow(config::kWinInput, cv::WINDOW_NORMAL);
"{ facepath f || a path to a Face detection model file (.xml).}"
"{ facedevice |GPU| the face detection computation device.}"
"{ landmpath l || a path to a Landmarks detection model file (.xml).}"
"{ landmdevice |CPU| the landmarks detection computation device.}"
"{ input i || a path to an input. Skip to capture from a camera.}"
"{ boxes b |false| set true to draw face Boxes in the \"Input\" window.}"
"{ landmarks m |false| set true to draw landMarks in the \"Input\" window.}"
);
parser.about("Use this script to run the face beautification"
" algorithm on G-API.");
cv::CommandLineParser parser(argc, argv, config::kParserOptions);
parser.about(config::kParserAbout);
if (argc == 1 || parser.has("help"))
{
parser.printMessage();
return 0;
}
cv::namedWindow(config::kWinFaceBeautification, cv::WINDOW_NORMAL);
cv::namedWindow(config::kWinInput, cv::WINDOW_NORMAL);
// Parsing input arguments
const std::string faceXmlPath = parser.get<std::string>("facepath");
const std::string faceBinPath = getWeightsPath(faceXmlPath);
@ -626,59 +632,47 @@ int main(int argc, char** argv)
const std::string landmBinPath = getWeightsPath(landmXmlPath);
const std::string landmDevice = parser.get<std::string>("landmdevice");
// The flags for drawing/not drawing face boxes or/and landmarks in the
// \"Input\" window:
const bool flgBoxes = parser.get<bool>("boxes");
const bool flgLandmarks = parser.get<bool>("landmarks");
// To provide this opportunity, it is necessary to check the flags when
// compiling a graph
// Declaring a graph
// Streaming-API version of a pipeline expression with a lambda-based
// The version of a pipeline expression with a lambda-based
// constructor is used to keep all temporary objects in a dedicated scope.
//! [ppl]
cv::GComputation pipeline([=]()
{
cv::GMat gimgIn;
// Infering
//! [net_usg_fd]
cv::GMat gimgIn; // input
cv::GMat faceOut = cv::gapi::infer<custom::FaceDetector>(gimgIn);
GArrayROI garRects = custom::GFacePostProc::on(faceOut, gimgIn,
config::kConfThresh);
cv::GArray<Landmarks> garElems;
cv::GArray<Contour> garJaws;
cv::GArray<cv::GMat> landmOut = cv::gapi::infer<custom::LandmDetector>(
garRects, gimgIn);
std::tie(garElems, garJaws) = custom::GLandmPostProc::on(landmOut,
garRects);
cv::GArray<Contour> garElsConts;
cv::GArray<Contour> garFaceConts;
std::tie(garElsConts, garFaceConts) = custom::GGetContours::on(garElems,
garJaws);
// Masks drawing
// All masks are created as CV_8UC1
cv::GMat mskSharp = custom::GFillPolyGContours::on(gimgIn,
garElsConts);
cv::GMat mskSharpG = cv::gapi::gaussianBlur(mskSharp,
config::kGKernelSize,
config::kGSigma);
cv::GMat mskBlur = custom::GFillPolyGContours::on(gimgIn,
garFaceConts);
cv::GMat mskBlurG = cv::gapi::gaussianBlur(mskBlur,
config::kGKernelSize,
config::kGSigma);
// The first argument in mask() is Blur as we want to subtract from
// BlurG the next step:
cv::GMat mskBlurFinal = mskBlurG - cv::gapi::mask(mskBlurG,
mskSharpG);
cv::GMat mskFacesGaussed = mskBlurFinal + mskSharpG;
cv::GMat mskFacesWhite = cv::gapi::threshold(mskFacesGaussed, 0, 255,
cv::THRESH_BINARY);
cv::GMat mskNoFaces = cv::gapi::bitwise_not(mskFacesWhite);
cv::GMat gimgBilat = custom::GBilatFilter::on(gimgIn,
config::kBSize,
config::kBSigmaCol,
config::kBSigmaSp);
cv::GMat gimgSharp = custom::unsharpMask(gimgIn,
config::kUnshSigma,
//! [net_usg_fd]
GArrayROI garRects = custom::GFacePostProc::on(faceOut, gimgIn, config::kConfThresh); // post-proc
//! [net_usg_ld]
cv::GArray<cv::GMat> landmOut = cv::gapi::infer<custom::LandmDetector>(garRects, gimgIn);
//! [net_usg_ld]
cv::GArray<Landmarks> garElems; // |
cv::GArray<Contour> garJaws; // |output arrays
std::tie(garElems, garJaws) = custom::GLandmPostProc::on(landmOut, garRects); // post-proc
cv::GArray<Contour> garElsConts; // face elements
cv::GArray<Contour> garFaceConts; // whole faces
std::tie(garElsConts, garFaceConts) = custom::GGetContours::on(garElems, garJaws); // interpolation
//! [msk_ppline]
cv::GMat mskSharp = custom::GFillPolyGContours::on(gimgIn, garElsConts); // |
cv::GMat mskSharpG = cv::gapi::gaussianBlur(mskSharp, config::kGKernelSize, // |
config::kGSigma); // |
cv::GMat mskBlur = custom::GFillPolyGContours::on(gimgIn, garFaceConts); // |
cv::GMat mskBlurG = cv::gapi::gaussianBlur(mskBlur, config::kGKernelSize, // |
config::kGSigma); // |draw masks
// The first argument in mask() is Blur as we want to subtract from // |
// BlurG the next step: // |
cv::GMat mskBlurFinal = mskBlurG - cv::gapi::mask(mskBlurG, mskSharpG); // |
cv::GMat mskFacesGaussed = mskBlurFinal + mskSharpG; // |
cv::GMat mskFacesWhite = cv::gapi::threshold(mskFacesGaussed, 0, 255, cv::THRESH_BINARY); // |
cv::GMat mskNoFaces = cv::gapi::bitwise_not(mskFacesWhite); // |
//! [msk_ppline]
cv::GMat gimgBilat = custom::GBilatFilter::on(gimgIn, config::kBSize,
config::kBSigmaCol, config::kBSigmaSp);
cv::GMat gimgSharp = custom::unsharpMask(gimgIn, config::kUnshSigma,
config::kUnshStrength);
// Applying the masks
// Custom function mask3C() should be used instead of just gapi::mask()
@ -686,54 +680,34 @@ int main(int argc, char** argv)
cv::GMat gimgBilatMasked = custom::mask3C(gimgBilat, mskBlurFinal);
cv::GMat gimgSharpMasked = custom::mask3C(gimgSharp, mskSharpG);
cv::GMat gimgInMasked = custom::mask3C(gimgIn, mskNoFaces);
cv::GMat gimgBeautif = gimgBilatMasked + gimgSharpMasked +
gimgInMasked;
// Drawing face boxes and landmarks if necessary:
cv::GMat gimgTemp;
if (flgLandmarks == true)
{
cv::GMat gimgTemp2 = custom::GPolyLines::on(gimgIn, garFaceConts,
config::kClosedLine,
config::kClrYellow);
gimgTemp = custom::GPolyLines::on(gimgTemp2, garElsConts,
config::kClosedLine,
config::kClrYellow);
}
else
{
gimgTemp = gimgIn;
}
cv::GMat gimgShow;
if (flgBoxes == true)
{
gimgShow = custom::GRectangle::on(gimgTemp, garRects,
config::kClrGreen);
}
else
{
// This action is necessary because an output node must be a result of
// some operations applied to an input node, so it handles the case
// when it should be nothing to draw
gimgShow = cv::gapi::copy(gimgTemp);
}
return cv::GComputation(cv::GIn(gimgIn),
cv::GOut(gimgBeautif, gimgShow));
cv::GMat gimgBeautif = gimgBilatMasked + gimgSharpMasked + gimgInMasked;
return cv::GComputation(cv::GIn(gimgIn), cv::GOut(gimgBeautif,
cv::gapi::copy(gimgIn),
garFaceConts,
garElsConts,
garRects));
});
//! [ppl]
// Declaring IE params for networks
//! [net_param]
auto faceParams = cv::gapi::ie::Params<custom::FaceDetector>
{
faceXmlPath,
faceBinPath,
faceDevice
/*std::string*/ faceXmlPath,
/*std::string*/ faceBinPath,
/*std::string*/ faceDevice
};
auto landmParams = cv::gapi::ie::Params<custom::LandmDetector>
{
landmXmlPath,
landmBinPath,
landmDevice
/*std::string*/ landmXmlPath,
/*std::string*/ landmBinPath,
/*std::string*/ landmDevice
};
//! [net_param]
//! [netw]
auto networks = cv::gapi::networks(faceParams, landmParams);
//! [netw]
// Declaring custom and fluid kernels have been used:
//! [kern_pass_1]
auto customKernels = cv::gapi::kernels<custom::GCPUBilateralFilter,
custom::GCPULaplacian,
custom::GCPUFillPolyGContours,
@ -744,40 +718,188 @@ int main(int argc, char** argv)
custom::GCPUGetContours>();
auto kernels = cv::gapi::combine(cv::gapi::core::fluid::kernels(),
customKernels);
//! [kern_pass_1]
Avg avg;
size_t frames = 0;
// The flags for drawing/not drawing face boxes or/and landmarks in the
// \"Input\" window:
const bool flgBoxes = parser.get<bool>("boxes");
const bool flgLandmarks = parser.get<bool>("landmarks");
// The flag to involve stream pipelining:
const bool flgStreaming = parser.get<bool>("streaming");
// The flag to display the output images or not:
const bool flgPerformance = parser.get<bool>("performance");
// Now we are ready to compile the pipeline to a stream with specified
// kernels, networks and image format expected to process
auto stream = pipeline.compileStreaming(cv::GMatDesc{CV_8U,3,
cv::Size(1280,720)},
cv::compile_args(kernels,
networks));
// Setting the source for the stream:
if (parser.has("input"))
if (flgStreaming == true)
{
stream.setSource(cv::gapi::wip::make_src<cv::gapi::wip::GCaptureSource>
(parser.get<cv::String>("input")));
}
else
{
stream.setSource(cv::gapi::wip::make_src<cv::gapi::wip::GCaptureSource>
(0));
}
// Declaring output variables
cv::Mat imgShow;
cv::Mat imgBeautif;
// Streaming:
stream.start();
while (stream.running())
{
auto out_vector = cv::gout(imgBeautif, imgShow);
if (!stream.try_pull(std::move(out_vector)))
//! [str_comp]
cv::GStreamingCompiled stream = pipeline.compileStreaming(cv::compile_args(kernels, networks));
//! [str_comp]
// Setting the source for the stream:
//! [str_src]
if (parser.has("input"))
{
// Use a try_pull() to obtain data.
// If there's no data, let UI refresh (and handle keypress)
if (cv::waitKey(1) >= 0) break;
else continue;
stream.setSource(cv::gapi::wip::make_src<cv::gapi::wip::GCaptureSource>(parser.get<cv::String>("input")));
}
cv::imshow(config::kWinInput, imgShow);
cv::imshow(config::kWinFaceBeautification, imgBeautif);
//! [str_src]
else
{
stream.setSource(cv::gapi::wip::make_src<cv::gapi::wip::GCaptureSource>(0));
}
// Declaring output variables
// Streaming:
cv::Mat imgShow;
cv::Mat imgBeautif;
std::vector<Contour> vctFaceConts, vctElsConts;
VectorROI vctRects;
if (flgPerformance == true)
{
auto out_vector = cv::gout(imgBeautif, imgShow, vctFaceConts,
vctElsConts, vctRects);
stream.start();
avg.start();
while (stream.running())
{
stream.pull(std::move(out_vector));
frames++;
}
}
else // flgPerformance == false
{
//! [str_loop]
auto out_vector = cv::gout(imgBeautif, imgShow, vctFaceConts,
vctElsConts, vctRects);
stream.start();
avg.start();
while (stream.running())
{
if (!stream.try_pull(std::move(out_vector)))
{
// Use a try_pull() to obtain data.
// If there's no data, let UI refresh (and handle keypress)
if (cv::waitKey(1) >= 0) break;
else continue;
}
frames++;
// Drawing face boxes and landmarks if necessary:
if (flgLandmarks == true)
{
cv::polylines(imgShow, vctFaceConts, config::kClosedLine,
config::kClrYellow);
cv::polylines(imgShow, vctElsConts, config::kClosedLine,
config::kClrYellow);
}
if (flgBoxes == true)
for (auto rect : vctRects)
cv::rectangle(imgShow, rect, config::kClrGreen);
cv::imshow(config::kWinInput, imgShow);
cv::imshow(config::kWinFaceBeautification, imgBeautif);
}
//! [str_loop]
}
std::cout << "Processed " << frames << " frames in " << avg.elapsed()
<< " (" << avg.fps(frames) << " FPS)" << std::endl;
}
else // serial mode:
{
//! [bef_cap]
#include <opencv2/videoio.hpp>
cv::GCompiled cc;
cv::VideoCapture cap;
if (parser.has("input"))
{
cap.open(parser.get<cv::String>("input"));
}
//! [bef_cap]
else if (!cap.open(0))
{
std::cout << "No input available" << std::endl;
return 1;
}
if (flgPerformance == true)
{
while (true)
{
cv::Mat img;
cv::Mat imgShow;
cv::Mat imgBeautif;
std::vector<Contour> vctFaceConts, vctElsConts;
VectorROI vctRects;
cap >> img;
if (img.empty())
{
break;
}
frames++;
if (!cc)
{
cc = pipeline.compile(cv::descr_of(img), cv::compile_args(kernels, networks));
avg.start();
}
cc(cv::gin(img), cv::gout(imgBeautif, imgShow, vctFaceConts,
vctElsConts, vctRects));
}
}
else // flgPerformance == false
{
//! [bef_loop]
while (cv::waitKey(1) < 0)
{
cv::Mat img;
cv::Mat imgShow;
cv::Mat imgBeautif;
std::vector<Contour> vctFaceConts, vctElsConts;
VectorROI vctRects;
cap >> img;
if (img.empty())
{
cv::waitKey();
break;
}
frames++;
//! [apply]
pipeline.apply(cv::gin(img), cv::gout(imgBeautif, imgShow,
vctFaceConts,
vctElsConts, vctRects),
cv::compile_args(kernels, networks));
//! [apply]
if (frames == 1)
{
// Start timer only after 1st frame processed -- compilation
// happens on-the-fly here
avg.start();
}
// Drawing face boxes and landmarks if necessary:
if (flgLandmarks == true)
{
cv::polylines(imgShow, vctFaceConts, config::kClosedLine,
config::kClrYellow);
cv::polylines(imgShow, vctElsConts, config::kClosedLine,
config::kClrYellow);
}
if (flgBoxes == true)
for (auto rect : vctRects)
cv::rectangle(imgShow, rect, config::kClrGreen);
cv::imshow(config::kWinInput, imgShow);
cv::imshow(config::kWinFaceBeautification, imgBeautif);
}
}
//! [bef_loop]
std::cout << "Processed " << frames << " frames in " << avg.elapsed()
<< " (" << avg.fps(frames) << " FPS)" << std::endl;
}
return 0;
}
#else
#include <iostream>
int main()
{
std::cerr << "This tutorial code requires G-API module "
"with Inference Engine backend to run"
<< std::endl;
return 1;
}
#endif // HAVE_OPENCV_GAPI