mirror of
https://github.com/opencv/opencv.git
synced 2025-01-10 22:28:13 +08:00
677 lines
33 KiB
HTML
<html>

<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1251">
<meta name=Generator content="Microsoft Word 11 (filtered)">
<title>Object Detection Using Haar-like Features with Cascade of Boosted
Classifiers</title>
<style>
<!--
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
	{margin:0in;
	margin-bottom:.0001pt;
	text-align:justify;
	font-size:12.0pt;
	font-family:"Times New Roman";}
h1
	{margin-top:12.0pt;
	margin-right:0in;
	margin-bottom:3.0pt;
	margin-left:0in;
	text-align:justify;
	page-break-after:avoid;
	font-size:16.0pt;
	font-family:Arial;}
h2
	{margin-top:12.0pt;
	margin-right:0in;
	margin-bottom:3.0pt;
	margin-left:0in;
	text-align:justify;
	page-break-after:avoid;
	font-size:14.0pt;
	font-family:Arial;
	font-style:italic;}
h3
	{margin-top:12.0pt;
	margin-right:0in;
	margin-bottom:3.0pt;
	margin-left:0in;
	text-align:justify;
	page-break-after:avoid;
	font-size:13.0pt;
	font-family:Arial;}
span.Typewch
	{font-family:"Courier New";
	font-weight:bold;}
@page Section1
	{size:595.3pt 841.9pt;
	margin:56.7pt 88.0pt 63.2pt 85.05pt;}
div.Section1
	{page:Section1;}
 /* List Definitions */
 ol
	{margin-bottom:0in;}
ul
	{margin-bottom:0in;}
-->
</style>

</head>

<body lang=RU>

<div class=Section1>

<h1><span lang=EN-US>Rapid Object Detection With A Cascade of Boosted
Classifiers Based on Haar-like Features</span></h1>

<h2><span lang=EN-US>Introduction</span></h2>

<p class=MsoNormal><span lang=EN-US>This document describes how to train and
use a cascade of boosted classifiers for rapid object detection. A large,
over-complete set of Haar-like features provides the basis for the simple
individual classifiers. Examples of object detection tasks are face, eye and
nose detection, as well as logo detection.</span></p>

<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>The sample detection task in this document
is logo detection, since logo detection does not require the collection of a
large set of registered and carefully marked object samples. Instead we assume
that a very large set of object examples can be derived from one prototype
image (</span><span class=Typewch><span lang=EN-US>createsamples</span></span><span
lang=EN-US> utility, see below).</span></p>

<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>A detailed description of the training/evaluation
algorithm can be found in [1] and [2].</span></p>

<h2><span lang=EN-US>Samples Creation</span></h2>

<p class=MsoNormal><span lang=EN-US>Before training, a set of samples must be
collected. There are two sample types: negative samples and positive samples.
Negative samples correspond to non-object images. Positive samples correspond
to object images.</span></p>

<h3><span lang=EN-US>Negative Samples</span></h3>

<p class=MsoNormal><span lang=EN-US>Negative samples are taken from arbitrary
images. These images must not contain object representations. Negative samples
are listed in a background description file. It is a text file in which each
line contains the filename (relative to the directory of the description
file) of a negative sample image. This file must be created manually. Note that
negative samples and negative sample images are also called background samples
and background sample images; the terms are used interchangeably in this
document.</span></p>

<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>Example of negative description file:</span></p>

<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>Directory structure:</span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US>/img</span></span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US>&nbsp; img1.jpg</span></span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US>&nbsp; img2.jpg</span></span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US>bg.txt</span></span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US>&nbsp;</span></span></p>

<p class=MsoNormal><span class=Typewch><span style='font-family:"Times New Roman";
font-weight:normal'>File </span></span><span class=Typewch><span lang=EN-US>bg.txt:</span></span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US>img/img1.jpg</span></span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US>img/img2.jpg</span></span></p>
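
<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>The utilities described here do not
generate the background description file, but for a large image directory it
can be produced with a small script. The following is a minimal sketch in
Python, assuming the directory layout of the example above; the function name
write_background_file and the list of accepted image extensions are
hypothetical and are not part of any OpenCV tool.</span></p>

<pre>
```python
import os
import tempfile

# Assumed set of image extensions; adjust to the formats actually used.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def write_background_file(root_dir, image_subdir="img", out_name="bg.txt"):
    """List every image in root_dir/image_subdir inside root_dir/out_name.

    Paths are written relative to root_dir (the directory that holds the
    description file), one filename per line.
    """
    image_dir = os.path.join(root_dir, image_subdir)
    names = sorted(
        n for n in os.listdir(image_dir)
        if n.lower().endswith(IMAGE_EXTENSIONS)
    )
    lines = [f"{image_subdir}/{n}" for n in names]
    with open(os.path.join(root_dir, out_name), "w") as f:
        f.write("\n".join(lines) + "\n")
    return lines


if __name__ == "__main__":
    # Build a throwaway directory that mirrors the example layout.
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "img"))
        for name in ("img1.jpg", "img2.jpg", "notes.txt"):
            open(os.path.join(root, "img", name), "w").close()
        print(write_background_file(root))
```
</pre>

<p class=MsoNormal><span lang=EN-US>Non-image files are skipped, and the
entries are sorted so that repeated runs produce the same file.</span></p>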

<h3><span lang=EN-US>Positive Samples</span></h3>

<p class=MsoNormal><span lang=EN-US>Positive samples are created by the </span><span
class=Typewch><span lang=EN-US>createsamples</span></span><span lang=EN-US>
utility. They may be created from a single object image or from a collection of
previously marked up images.<br>
<br>
</span></p>

<p class=MsoNormal><span lang=EN-US>The single object image may, for instance,
contain a company logo. A large set of positive samples is then created from
the given object image by randomly rotating the logo, changing its color, and
placing it on an arbitrary background.</span></p>

<p class=MsoNormal><span lang=EN-US>The amount and range of randomness can be
controlled by command line arguments.</span></p>

<p class=MsoNormal><span lang=EN-US>Command line arguments:</span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-vec &lt;vec_file_name&gt;</span></span><span
lang=EN-US> </span></p>

<p class=MsoNormal style='margin-left:17.1pt'><span lang=EN-US>name of the
output file containing the positive samples for training</span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-img &lt;image_file_name&gt;</span></span><span
lang=EN-US> </span></p>

<p class=MsoNormal style='margin-left:17.1pt'><span lang=EN-US>source object
image (e.g., a company logo)</span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-bg &lt;background_file_name&gt;</span></span><span
lang=EN-US> </span></p>

<p class=MsoNormal style='margin-left:17.1pt'><span lang=EN-US>background
description file; contains a list of images into which randomly distorted
versions of the object are pasted for positive sample generation</span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-num &lt;number_of_samples&gt;</span></span><span
lang=EN-US> </span></p>

<p class=MsoNormal style='margin-left:17.1pt'><span lang=EN-US>number of
positive samples to generate</span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-bgcolor &lt;background_color&gt;</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
lang=EN-US>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; background color (currently grayscale images are assumed); the
background color denotes the transparent color. Since there might be
compression artifacts, the amount of color tolerance can be specified by </span><span
class=Typewch><span lang=EN-US>-bgthresh</span></span><span class=Typewch><span
lang=EN-US style='font-family:Arial;font-weight:normal'>. </span></span><span
lang=EN-US>All pixels between </span><span class=Typewch><span lang=EN-US>bgcolor-bgthresh</span></span><span
lang=EN-US> and </span><span class=Typewch><span lang=EN-US>bgcolor+bgthresh</span></span><span
lang=EN-US> are regarded as transparent.</span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-bgthresh &lt;background_color_threshold&gt;</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-inv</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
lang=EN-US>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if specified, the colors will be inverted</span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-randinv</span></span><span lang=EN-US> </span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
lang=EN-US>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if specified, the colors will be inverted randomly</span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-maxidev &lt;max_intensity_deviation&gt;</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>&nbsp; </span></span><span lang=EN-US>maximal
intensity deviation of pixels in foreground samples</span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-maxxangle &lt;max_x_rotation_angle&gt;,</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-maxyangle &lt;max_y_rotation_angle&gt;,</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-maxzangle &lt;max_z_rotation_angle&gt;</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
lang=EN-US>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; maximum rotation angles in radians</span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-show</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
lang=EN-US>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if specified, each sample will be shown. Pressing &quot;Esc&quot;
continues the creation process without showing further samples. Useful for
debugging.</span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-w &lt;sample_width&gt;</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>&nbsp; </span></span><span class=Typewch><span
lang=EN-US style='font-family:"Times New Roman";font-weight:normal'>width (in
pixels) of the output samples</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-h &lt;sample_height&gt;</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>&nbsp; </span></span><span class=Typewch><span
lang=EN-US style='font-family:"Times New Roman";font-weight:normal'>height (in
pixels) of the output samples</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>&nbsp;</span></span></p>

<p class=MsoNormal><span lang=EN-US>The following procedure is used to create a
sample object instance:</span></p>

<p class=MsoNormal><span lang=EN-US>The source image is rotated randomly around
all three axes. The chosen angles are limited by</span><span class=Typewch><span
lang=EN-US> -max?angle</span></span><span lang=EN-US> (where ? is x, y or z).
Next, pixels with intensities in the range </span><span class=Typewch><span lang=EN-US>[bg_color-bg_color_threshold;
bg_color+bg_color_threshold]</span></span><span lang=EN-US> are regarded as
transparent. White noise is added to the intensities of the foreground. If the </span><span
class=Typewch><span lang=EN-US>-inv</span></span><span lang=EN-US> key is
specified then foreground pixel intensities are inverted. If the </span><span
class=Typewch><span lang=EN-US>-randinv</span></span><span lang=EN-US> key is
specified then it is randomly selected for each sample whether inversion is
applied. Finally, the obtained image is placed onto an arbitrary background
from the background description file, resized to the pixel size specified by </span><span
class=Typewch><span lang=EN-US>-w</span></span><span lang=EN-US> and </span><span
class=Typewch><span lang=EN-US>-h</span></span><span lang=EN-US> and stored
in the file specified by the </span><span class=Typewch><span lang=EN-US>-vec</span></span><span
lang=EN-US> command line parameter.</span></p>
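
<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>The transparency and inversion steps of
this procedure can be illustrated with a minimal Python sketch. It is not the
createsamples implementation: it assumes 8-bit grayscale intensities stored as
nested lists, uses the hypothetical function names is_transparent and
paste_object, and leaves out rotation, noise and resizing.</span></p>

<pre>
```python
def is_transparent(pixel, bgcolor, bgthresh):
    """True when pixel lies in [bgcolor - bgthresh, bgcolor + bgthresh]."""
    return bgthresh >= abs(pixel - bgcolor)


def paste_object(obj, background, bgcolor, bgthresh, invert=False):
    """Overlay a grayscale object on a same-sized grayscale background.

    Transparent object pixels let the background show through; the
    remaining (foreground) pixels are optionally inverted, which mirrors
    the effect described for the -inv key. Assumes 8-bit intensities.
    """
    out = []
    for obj_row, bg_row in zip(obj, background):
        row = []
        for o, b in zip(obj_row, bg_row):
            if is_transparent(o, bgcolor, bgthresh):
                row.append(b)
            else:
                row.append(255 - o if invert else o)
        out.append(row)
    return out


if __name__ == "__main__":
    logo = [[0, 3, 200], [250, 0, 100]]          # toy 2x3 object image
    backdrop = [[10, 20, 30], [40, 50, 60]]      # toy 2x3 background
    print(paste_object(logo, backdrop, bgcolor=0, bgthresh=5))
    print(paste_object(logo, backdrop, bgcolor=0, bgthresh=5, invert=True))
```
</pre>

<p class=MsoNormal><span lang=EN-US>With a background color of 0 and a
threshold of 5, the object pixels 0 and 3 are treated as transparent and are
replaced by the background, while 200, 250 and 100 are kept as foreground.</span></p>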

<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>Positive samples may also be obtained from
a collection of previously marked up images. This collection is described by a
text file similar to the background description file. Each line of this file
corresponds to one image of the collection. The first element of the line is
the image file name. It is followed by the number of object instances. The
remaining numbers are the coordinates of the bounding rectangles (x, y, width,
height), four per object instance.</span></p>

<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>Example of description file:</span></p>

<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>Directory structure:</span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US>/img</span></span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US>&nbsp; img1.jpg</span></span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US>&nbsp; img2.jpg</span></span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US>info.dat</span></span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US>&nbsp;</span></span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US style='font-family:
"Times New Roman";font-weight:normal'>File </span></span><span class=Typewch><span
lang=EN-US>info.dat:</span></span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US>img/img1.jpg&nbsp; 1&nbsp; 140
100 45 45</span></span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US>img/img2.jpg&nbsp; 2&nbsp; 100
200 50 50&nbsp;&nbsp; 50 30 25 25</span></span></p>

<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>Image </span><span class=Typewch><span
lang=EN-US>img1.jpg</span></span><span lang=EN-US> contains a single object
instance with bounding rectangle (140, 100, 45, 45). Image </span><span
class=Typewch><span lang=EN-US>img2.jpg</span></span><span lang=EN-US> contains
two object instances.</span></p>
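
<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>As a check on the line format, the
following Python sketch parses one description line into a file name and a
list of bounding rectangles. The function name parse_info_line is hypothetical,
and the sketch assumes whitespace-separated fields and file names without
spaces.</span></p>

<pre>
```python
def parse_info_line(line):
    """Split one description line into (filename, [(x, y, width, height), ...])."""
    fields = line.split()
    filename, count = fields[0], int(fields[1])
    numbers = [int(v) for v in fields[2:]]
    # Every object instance contributes exactly four coordinates.
    if len(numbers) != 4 * count:
        raise ValueError(
            f"expected {4 * count} coordinates, got {len(numbers)}"
        )
    rects = [tuple(numbers[i:i + 4]) for i in range(0, len(numbers), 4)]
    return filename, rects


if __name__ == "__main__":
    for text in (
        "img/img1.jpg  1  140 100 45 45",
        "img/img2.jpg  2  100 200 50 50   50 30 25 25",
    ):
        print(parse_info_line(text))
```
</pre>

<p class=MsoNormal><span lang=EN-US>A line whose instance count does not match
the number of coordinates raises an error, which helps catch mistakes in
manually edited description files.</span></p>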

<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>In order to create positive samples from
such a collection, the </span><span class=Typewch><span lang=EN-US>-info</span></span><span
lang=EN-US> argument should be specified instead of </span><span class=Typewch><span
lang=EN-US>-img</span></span><span class=Typewch><span style='font-family:"Times New Roman";
font-weight:normal'>:</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-info &lt;collection_file_name&gt;</span></span><span
lang=EN-US> </span></p>

<p class=MsoNormal style='margin-left:17.1pt'><span lang=EN-US>description file
of the marked up image collection</span></p>

<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>The scheme of sample creation in this case
is as follows. The object instances are taken from the images. Then they are
resized to the sample size and stored in the output file. No distortion is
applied, so the only arguments that have an effect are </span><span class=Typewch><span lang=EN-US>-w</span></span><span
lang=EN-US>, </span><span class=Typewch><span lang=EN-US>-h</span></span><span
lang=EN-US>, </span><span class=Typewch><span lang=EN-US>-show</span></span><span
lang=EN-US> and </span><span class=Typewch><span lang=EN-US>-num</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>.</span></span></p>

<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US>createsamples</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> utility may also be used to examine samples stored in a positive
samples file. To do this, only the </span></span><span class=Typewch><span
lang=EN-US>-vec</span></span><span class=Typewch><span lang=EN-US
style='font-family:"Times New Roman";font-weight:normal'>, </span></span><span
class=Typewch><span lang=EN-US>-w</span></span><span class=Typewch><span
lang=EN-US style='font-family:"Times New Roman";font-weight:normal'> and </span></span><span
class=Typewch><span lang=EN-US>-h</span></span><span class=Typewch><span
lang=EN-US style='font-family:"Times New Roman";font-weight:normal'> parameters
should be specified.</span></span></p>

<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>Note that for training, it does not matter
how the positive samples files are generated. The </span><span class=Typewch><span
lang=EN-US>createsamples</span></span><span lang=EN-US> utility is only one way
to collect/create a vector file of positive samples.</span></p>

<h2><span lang=EN-US>Training</span></h2>

<p class=MsoNormal><span lang=EN-US>The next step after samples creation is
training of the classifier. It is performed by the </span><span class=Typewch><span
lang=EN-US>haartraining</span></span><span lang=EN-US> utility.</span></p>

<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>Command line arguments:</span><span
class=Typewch><span lang=EN-US> </span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-data &lt;dir_name&gt;</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> </span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; name of the directory in which the trained classifier is stored</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-vec &lt;vec_file_name&gt;</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> </span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; file name of the positive sample file (created by the </span></span><span
class=Typewch><span lang=EN-US>createsamples</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> utility or by any other means)</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-bg &lt;background_file_name&gt;</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; background description file</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-npos &lt;number_of_positive_samples&gt;,</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-nneg &lt;number_of_negative_samples&gt;</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> </span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; number of positive/negative samples used in training of each
classifier stage. Reasonable values are npos = 7000 and nneg = 3000.</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-nstages &lt;number_of_stages&gt;</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>&nbsp; </span></span><span class=Typewch><span
lang=EN-US style='font-family:"Times New Roman";font-weight:normal'>number of
stages to be trained</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-nsplits &lt;number_of_splits&gt;</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> </span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; determines the weak classifier used in stage classifiers. If </span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman"'>1</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>, then a simple stump classifier is used; if </span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman"'>2</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> or more, then a CART classifier with </span></span><span class=Typewch><span
lang=EN-US>number_of_splits</span></span><span class=Typewch><span lang=EN-US
style='font-family:"Times New Roman";font-weight:normal'> internal (split)
nodes is used</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-mem &lt;memory_in_MB&gt;</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> </span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; available memory in MB for precalculation. The more memory you
have, the faster the training process</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-sym (default),</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-nonsym</span></span><span class=Typewch><span
lang=EN-US style='font-family:"Times New Roman";font-weight:normal'> </span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; specifies whether the object class under training has vertical
symmetry or not. Vertical symmetry speeds up the training process. For instance,
frontal faces exhibit vertical symmetry</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-minhitrate &lt;min_hit_rate&gt;</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; minimal desired hit rate for each stage classifier. Overall hit
rate may be estimated as </span></span><span class=Typewch><span lang=EN-US>(min_hit_rate^number_of_stages)</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-maxfalsealarm &lt;max_false_alarm_rate&gt;</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> </span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; maximal desired false alarm rate for each stage classifier. </span></span><span
class=Typewch><span style='font-family:"Times New Roman";font-weight:normal'>Overall
false alarm rate may be estimated as</span></span><span class=Typewch><span
lang=EN-US> (max_false_alarm_rate^number_of_stages)</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-weighttrimming &lt;weight_trimming&gt;</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>&nbsp; </span></span><span class=Typewch><span
lang=EN-US style='font-family:"Times New Roman";font-weight:normal'>specifies
whether and how much weight trimming should be used. A decent choice is 0.90.</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-eqw</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-mode &lt;BASIC (default) | CORE | ALL&gt;</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> </span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; selects the type of Haar feature set used in training. BASIC uses
only upright features, while ALL uses the full set of upright and 45 degree
rotated features. See [1] for more details.</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-w &lt;sample_width&gt;,</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>-h &lt;sample_height&gt;</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> </span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; size of training samples (in pixels). Must have exactly the same
values as used during training samples creation (utility </span></span><span
class=Typewch><span lang=EN-US>createsamples</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>)</span></span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US style='font-family:
"Times New Roman";font-weight:normal'>&nbsp;</span></span></p>

<p class=MsoNormal><span class=Typewch><span lang=EN-US style='font-family:
"Times New Roman";font-weight:normal'>Note: to take advantage of multiple
processors, a compiler that supports the OpenMP 1.0 standard should be used.</span></span></p>
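
<p class=MsoNormal><span lang=EN-US>&nbsp;</span></p>

<p class=MsoNormal><span lang=EN-US>The overall rate estimates given for
-minhitrate and -maxfalsealarm can be worked through numerically. The Python
sketch below applies the stated formula (per-stage rate raised to the number of
stages); the function name overall_rate and the argument values are
hypothetical and chosen only for illustration.</span></p>

<pre>
```python
def overall_rate(per_stage_rate, number_of_stages):
    """Estimate a cascade-wide rate as per_stage_rate ** number_of_stages."""
    return per_stage_rate ** number_of_stages


if __name__ == "__main__":
    stages = 20              # hypothetical -nstages value
    min_hit_rate = 0.995     # hypothetical -minhitrate value
    max_false_alarm = 0.5    # hypothetical -maxfalsealarm value
    hit = overall_rate(min_hit_rate, stages)
    false_alarm = overall_rate(max_false_alarm, stages)
    print(f"estimated overall hit rate:         {hit:.4f}")
    print(f"estimated overall false alarm rate: {false_alarm:.2e}")
```
</pre>

<p class=MsoNormal><span lang=EN-US>Because both estimates decay geometrically
with the number of stages, a per-stage hit rate only slightly below 1 still
reduces the overall hit rate noticeably, while a moderate per-stage false alarm
rate yields a very small overall false alarm rate.</span></p>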
|
|||
|
|
|||
|
<h2><span lang=EN-US>Application</span></h2>
|
|||
|
|
|||
|
<p class=MsoNormal><span lang=EN-US>OpenCV cvHaarDetectObjects() function (in
|
|||
|
particular haarFaceDetect demo) is used for detection.</span></p>
|
|||
|
|
|||
|
<h3><span lang=EN-US>Test Samples</span></h3>
|
|||
|
|
|||
|
<p class=MsoNormal><span lang=EN-US>In order to evaluate the performance of
|
|||
|
trained classifier a collection of marked up images is needed. When such
|
|||
|
collection is not available test samples may be created from single object
|
|||
|
image by </span><span class=Typewch><span lang=EN-US>createsamples</span></span><span
|
|||
|
lang=EN-US> utility. The scheme of test samples creation in this case is
|
|||
|
similar to training samples creation since each test sample is a background
|
|||
|
image into which a randomly distorted and randomly scaled instance of the
|
|||
|
object picture is pasted at a random position. </span></p>
|
|||
|
|
|||
|
<p class=MsoNormal><span lang=EN-US> </span></p>
|
|||
|
|
|||
|
<p class=MsoNormal><span lang=EN-US>If both the </span><span class=Typewch><span
lang=EN-US>-img</span></span><span lang=EN-US> and </span><span class=Typewch><span
lang=EN-US>-info</span></span><span lang=EN-US> arguments are specified, test
samples are created by the </span><span class=Typewch><span lang=EN-US>createsamples</span></span><span
lang=EN-US> utility. The sample image is randomly distorted as described for
training sample creation, then placed at a random location in a background
image and stored. The corresponding description line is added to the file specified by the </span><span
class=Typewch><span lang=EN-US>-info</span></span><span lang=EN-US> argument.</span></p>
|
|||
|
|
|||
|
<p class=MsoNormal><span lang=EN-US> </span></p>
|
|||
|
|
|||
|
<p class=MsoNormal><span lang=EN-US>The </span><span class=Typewch><span
lang=EN-US>-w</span></span><span lang=EN-US> and </span><span class=Typewch><span
lang=EN-US>-h</span></span><span lang=EN-US> keys determine the minimum size of
the placed object picture.</span></p>
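
<p class=MsoNormal><span lang=EN-US>The placement step can be illustrated with
a minimal sketch. It is not the createsamples implementation: the function name
random_placement, the choice to keep the aspect ratio, and the uniform sampling
are all assumptions made for illustration. The only property taken from the
text is that the placed object is never smaller than the size given by the
width and height keys and that its position is random.</span></p>

<pre>
```python
import random


def random_placement(bg_width, bg_height, min_width, min_height, rng):
    """Pick a random size and position for an object inside a background.

    Hypothetical helper: returns (x, y, width, height). The rectangle lies
    inside the background and is never smaller than min_width x min_height.
    """
    # Largest scale at which the object still fits into the background.
    max_scale = min(bg_width / min_width, bg_height / min_height)
    if 1.0 > max_scale:
        raise ValueError("background is smaller than the minimum object size")
    # Assumption: one uniform scale for both axes (aspect ratio is kept).
    scale = rng.uniform(1.0, max_scale)
    width = min(int(min_width * scale), bg_width)
    height = min(int(min_height * scale), bg_height)
    # Random top-left corner such that the object stays inside the image.
    x = rng.randint(0, bg_width - width)
    y = rng.randint(0, bg_height - height)
    return x, y, width, height


if __name__ == "__main__":
    print(random_placement(320, 240, 24, 24, random.Random(0)))
```
</pre>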
|
|||
|
|
|||
|
<p class=MsoNormal><span lang=EN-US> </span></p>
|
|||
|
|
|||
|
<p class=MsoNormal><span lang=EN-US>The test image file name format is as
|
|||
|
follows:</span></p>
|
|||
|
|
|||
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US>imageOrderNumber_x_y_width_height.jpg</span></span><span
|
|||
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
|
|||
|
normal'>, where </span></span><span class=Typewch><span lang=EN-US>x</span></span><span
|
|||
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
|
|||
|
normal'>, </span></span><span class=Typewch><span lang=EN-US>y</span></span><span
|
|||
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
|
|||
|
normal'>, </span></span><span class=Typewch><span lang=EN-US>width</span></span><span
|
|||
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
|
|||
|
normal'> and </span></span><span class=Typewch><span lang=EN-US>height</span></span><span
|
|||
|
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
|
|||
|
normal'> are the coordinates of the bounding rectangle of the placed object.</span></span></p>
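
<p class=MsoNormal><span lang=EN-US>Because the reference rectangle is encoded
in the file name, it can be recovered without a separate description file. The
sketch below assumes that every field is a decimal integer (possibly
zero-padded); the text does not state the padding, and the names PlacedObject
and parse_test_image_name are hypothetical.</span></p>

<pre>
```python
import re
from typing import NamedTuple


class PlacedObject(NamedTuple):
    order_number: int
    x: int
    y: int
    width: int
    height: int


# Assumed layout: five decimal fields separated by underscores, then ".jpg".
_NAME_RE = re.compile(r"^(\d+)_(\d+)_(\d+)_(\d+)_(\d+)\.jpg$")


def parse_test_image_name(file_name: str) -> PlacedObject:
    """Split imageOrderNumber_x_y_width_height.jpg into integer fields."""
    match = _NAME_RE.match(file_name)
    if match is None:
        raise ValueError("unexpected test image name: " + file_name)
    return PlacedObject(*(int(group) for group in match.groups()))


if __name__ == "__main__":
    print(parse_test_image_name("0001_0034_0056_0040_0040.jpg"))
```
</pre>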
|
|||
|
|
|||
|
<p class=MsoNormal><span class=Typewch><span lang=EN-US style='font-family:
"Times New Roman";font-weight:normal'>Note that the set of background images
used here should be different from the set used during training.</span></span></p>
|
|||
|
|
|||
|
<h3><span class=Typewch><span lang=EN-US style='font-family:"Times New Roman"'>Performance Evaluation</span></span></h3>
|
|||
|
|
|||
|
<p class=MsoNormal><span lang=EN-US>The </span><span class=Typewch><span lang=EN-US>performance</span></span><span
lang=EN-US> utility may be used to evaluate the classifier. It takes a
collection of marked-up images, applies the classifier, and reports the
results: the number of found objects, the number of missed objects, the
number of false alarms, and other information.</span></p>
|
|||
|
|
|||
|
<p class=MsoNormal><span lang=EN-US> </span></p>
|
|||
|
|
|||
|
<p class=MsoNormal><span lang=EN-US>Command line arguments:</span><span
|
|||
|
class=Typewch><span lang=EN-US> </span></span></p>
|
|||
|
|
|||
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>- data &lt;dir_name&gt;</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> </span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>-&nbsp;&nbsp;&nbsp; name of the directory in which the trained classifier is stored</span></span></p>
|
|||
|
|
|||
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>- info &lt;collection_file_name&gt;</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> </span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>-&nbsp;&nbsp;&nbsp; file with the description of the test samples</span></span></p>
|
|||
|
|
|||
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>- maxSizeDiff &lt;max_size_difference&gt;</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>,</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>- maxPosDiff &lt;max_position_difference&gt;</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> </span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>-&nbsp;&nbsp;&nbsp; determine the criterion for deciding that a detected rectangle
coincides with a reference rectangle. Default values are 1.5 and 0.3, respectively.</span></span></p>
|
|||
|
|
|||
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>- sf &lt;scale_factor&gt;</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> </span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>-&nbsp;&nbsp;&nbsp; detection parameter (scale factor). Default value is 1.2.</span></span></p>
|
|||
|
|
|||
|
<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>- w &lt;sample_width&gt;,</span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US>- h &lt;sample_height&gt;</span></span><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'> </span></span></p>

<p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
normal'>-&nbsp;&nbsp;&nbsp; size of the training samples (in pixels). Must be exactly the same
values as those used during training (the </span></span><span class=Typewch><span
lang=EN-US>haartraining</span></span><span class=Typewch><span lang=EN-US
style='font-family:"Times New Roman";font-weight:normal'> utility).</span></span></p>
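
<p class=MsoNormal><span lang=EN-US>The text above says that maxSizeDiff and
maxPosDiff decide when a detected rectangle coincides with a reference
rectangle, but it does not spell out the rule. The sketch below shows one
plausible reading, stated here as an assumption: the top-left corners must be
closer than maxPosDiff times the reference width, and the detected width must
be within a factor of maxSizeDiff of the reference width. The function names
are hypothetical, and the counting of found objects, missed objects and false
alarms is likewise only an illustration of the reported quantities.</span></p>

<pre>
```python
import math


def rectangles_coincide(ref, det, max_size_diff=1.5, max_pos_diff=0.3):
    """ref and det are (x, y, width, height) tuples.

    Assumed rule (not taken from the documentation): corners are close
    relative to the reference width, and the widths are similar.
    """
    distance = math.hypot(det[0] - ref[0], det[1] - ref[1])
    close_enough = ref[2] * max_pos_diff > distance
    similar_size = ref[2] * max_size_diff > det[2] > ref[2] / max_size_diff
    return close_enough and similar_size


def evaluate(references, detections, max_size_diff=1.5, max_pos_diff=0.3):
    """Return (hits, missed, false_alarms) for one image."""
    matched = [False] * len(detections)
    hits = 0
    for ref in references:
        found = False
        for index, det in enumerate(detections):
            if rectangles_coincide(ref, det, max_size_diff, max_pos_diff):
                matched[index] = True
                found = True
        if found:
            hits += 1
    missed = len(references) - hits
    # Detections that matched no reference rectangle count as false alarms.
    false_alarms = matched.count(False)
    return hits, missed, false_alarms


if __name__ == "__main__":
    references = [(10, 10, 40, 40)]
    detections = [(12, 11, 44, 44), (200, 150, 40, 40)]
    print(evaluate(references, detections))  # prints (1, 0, 1)
```
</pre>

<p class=MsoNormal><span lang=EN-US>With the default values, a detection
shifted by a couple of pixels and 10% larger than the reference still counts
as a hit, while a detection elsewhere in the image is reported as a false
alarm.</span></p>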
|
|||
|
|
|||
|
<h2><span lang=EN-US>References</span></h2>
|
|||
|
|
|||
|
<p class=MsoNormal><span lang=EN-US>[1] Rainer Lienhart and Jochen Maydt. An
|
|||
|
Extended Set of Haar-like Features for Rapid Object Detection. Submitted to
|
|||
|
ICIP2002.</span></p>
|
|||
|
|
|||
|
<p class=MsoNormal><span lang=EN-US>[2] Alexander Kuranov, Rainer Lienhart, and
|
|||
|
Vadim Pisarevsky. An Empirical Analysis of Boosting Algorithms for Rapid
|
|||
|
Objects With an Extended Set of Haar-like Features. Intel Technical Report
|
|||
|
MRL-TR-July02-01, 2002.</span></p>
|
|||
|
|
|||
|
</div>
|
|||
|
|
|||
|
</body>
|
|||
|
|
|||
|
</html>
|