mirror of
https://github.com/opencv/opencv.git
synced 2025-06-07 17:44:04 +08:00
Merge pull request #16919 from alalek:backport_16860
commit
222a48577f
@ -51,12 +51,136 @@
|
||||
/**
|
||||
@defgroup calib3d Camera Calibration and 3D Reconstruction
|
||||
|
||||
The functions in this section use a so-called pinhole camera model. In this model, a scene view is
|
||||
formed by projecting 3D points into the image plane using a perspective transformation.
|
||||
The functions in this section use a so-called pinhole camera model. The view of a scene
|
||||
is obtained by projecting a scene's 3D point \f$P_w\f$ into the image plane using a perspective
|
||||
transformation which forms the corresponding pixel \f$p\f$. Both \f$P_w\f$ and \f$p\f$ are
|
||||
represented in homogeneous coordinates, i.e. as 3D and 2D homogeneous vectors, respectively. You will
|
||||
find a brief introduction to projective geometry, homogeneous vectors and homogeneous
|
||||
transformations at the end of this section's introduction. For more succinct notation, we often drop
|
||||
the 'homogeneous' and say vector instead of homogeneous vector.
|
||||
|
||||
\f[s \; m' = A [R|t] M'\f]
|
||||
The distortion-free projective transformation given by a pinhole camera model is shown below.
|
||||
|
||||
or
|
||||
\f[s \; p = A \begin{bmatrix} R|t \end{bmatrix} P_w,\f]
|
||||
|
||||
where \f$P_w\f$ is a 3D point expressed with respect to the world coordinate system,
|
||||
\f$p\f$ is a 2D pixel in the image plane, \f$A\f$ is the intrinsic camera matrix,
|
||||
\f$R\f$ and \f$t\f$ are the rotation and translation that describe the change of coordinates from
|
||||
world to camera coordinate systems (or camera frame) and \f$s\f$ is the projective transformation's
|
||||
arbitrary scaling and not part of the camera model.
|
||||
|
||||
The intrinsic camera matrix \f$A\f$ (notation used as in @cite Zhang2000 and also generally notated
|
||||
as \f$K\f$) projects 3D points given in the camera coordinate system to 2D pixel coordinates, i.e.
|
||||
|
||||
\f[p = A P_c.\f]
|
||||
|
||||
The camera matrix \f$A\f$ is composed of the focal lengths \f$f_x\f$ and \f$f_y\f$, which are
|
||||
expressed in pixel units, and the principal point \f$(c_x, c_y)\f$, that is usually close to the
|
||||
image center:
|
||||
|
||||
\f[A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1},\f]
|
||||
|
||||
and thus
|
||||
|
||||
\f[s \vecthree{u}{v}{1} = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1} \vecthree{X_c}{Y_c}{Z_c}.\f]
|
||||
|
||||
The matrix of intrinsic parameters does not depend on the scene viewed. So, once estimated, it can
|
||||
be re-used as long as the focal length is fixed (in case of a zoom lens). Thus, if an image from the
|
||||
camera is scaled by a factor, all of these parameters need to be scaled (multiplied/divided,
|
||||
respectively) by the same factor.
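As a minimal sketch of the scaling rule above, the following standalone C++ snippet multiplies all four intrinsic parameters by the resize factor. It is not part of the OpenCV API: the `Intrinsics` struct and the `scaleIntrinsics` helper are hypothetical names introduced only for illustration, and the sketch assumes a uniform resize and a pixel coordinate convention in which the principal point scales linearly with the image size.

```cpp
#include <cassert>

// Hypothetical container for the four intrinsic parameters of A.
struct Intrinsics {
    double fx, fy;  // focal lengths in pixel units
    double cx, cy;  // principal point in pixel units
};

// Returns the intrinsics for an image resized by `factor`
// (factor = 2.0 means twice as wide and twice as tall).
Intrinsics scaleIntrinsics(const Intrinsics& in, double factor) {
    assert(factor > 0.0);  // a resize factor must be positive
    return {in.fx * factor, in.fy * factor, in.cx * factor, in.cy * factor};
}
```

Under these assumptions, illustrative intrinsics with \f$(c_x, c_y) = (160, 120)\f$ map to \f$(320, 240)\f$ when the image is enlarged by a factor of 2.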
|
||||
|
||||
The joint rotation-translation matrix \f$[R|t]\f$ is the matrix product of a projective
|
||||
transformation and a homogeneous transformation. The 3-by-4 projective transformation maps 3D points
|
||||
represented in camera coordinates to 2D points in the image plane and represented in normalized
|
||||
camera coordinates \f$x' = X_c / Z_c\f$ and \f$y' = Y_c / Z_c\f$:
|
||||
|
||||
\f[Z_c \begin{bmatrix}
x' \\
y' \\
1
\end{bmatrix} = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix}
X_c \\
Y_c \\
Z_c \\
1
\end{bmatrix}.\f]
|
||||
|
||||
The homogeneous transformation is encoded by the extrinsic parameters \f$R\f$ and \f$t\f$ and
|
||||
represents the change of basis from world coordinate system \f$w\f$ to the camera coordinate system
|
||||
\f$c\f$. Thus, given the representation of the point \f$P\f$ in world coordinates, \f$P_w\f$, we
|
||||
obtain \f$P\f$'s representation in the camera coordinate system, \f$P_c\f$, by
|
||||
|
||||
\f[P_c = \begin{bmatrix}
|
||||
R & t \\
|
||||
0 & 1
|
||||
\end{bmatrix} P_w,\f]
|
||||
|
||||
This homogeneous transformation is composed out of \f$R\f$, a 3-by-3 rotation matrix, and \f$t\f$, a
|
||||
3-by-1 translation vector:
|
||||
|
||||
\f[\begin{bmatrix}
R & t \\
0 & 1
\end{bmatrix} = \begin{bmatrix}
r_{11} & r_{12} & r_{13} & t_x \\
r_{21} & r_{22} & r_{23} & t_y \\
r_{31} & r_{32} & r_{33} & t_z \\
0 & 0 & 0 & 1
\end{bmatrix},
\f]
|
||||
|
||||
and therefore
|
||||
|
||||
\f[\begin{bmatrix}
X_c \\
Y_c \\
Z_c \\
1
\end{bmatrix} = \begin{bmatrix}
r_{11} & r_{12} & r_{13} & t_x \\
r_{21} & r_{22} & r_{23} & t_y \\
r_{31} & r_{32} & r_{33} & t_z \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
X_w \\
Y_w \\
Z_w \\
1
\end{bmatrix}.\f]
|
||||
|
||||
Combining the projective transformation and the homogeneous transformation, we obtain the projective
|
||||
transformation that maps 3D points in world coordinates into 2D points in the image plane and in
|
||||
normalized camera coordinates:
|
||||
|
||||
\f[Z_c \begin{bmatrix}
x' \\
y' \\
1
\end{bmatrix} = \begin{bmatrix} R|t \end{bmatrix} \begin{bmatrix}
X_w \\
Y_w \\
Z_w \\
1
\end{bmatrix} = \begin{bmatrix}
r_{11} & r_{12} & r_{13} & t_x \\
r_{21} & r_{22} & r_{23} & t_y \\
r_{31} & r_{32} & r_{33} & t_z
\end{bmatrix}
\begin{bmatrix}
X_w \\
Y_w \\
Z_w \\
1
\end{bmatrix},\f]
|
||||
|
||||
with \f$x' = X_c / Z_c\f$ and \f$y' = Y_c / Z_c\f$. Putting the equations for intrinsics and extrinsics together, we can write out
|
||||
\f$s \; p = A \begin{bmatrix} R|t \end{bmatrix} P_w\f$ as
|
||||
|
||||
\f[s \vecthree{u}{v}{1} = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}
|
||||
\begin{bmatrix}
|
||||
@ -69,62 +193,81 @@ X_w \\
|
||||
Y_w \\
|
||||
Z_w \\
|
||||
1
|
||||
\end{bmatrix}.\f]
|
||||
|
||||
If \f$Z_c \ne 0\f$, the transformation above is equivalent to the following,
|
||||
|
||||
\f[\begin{bmatrix}
|
||||
u \\
|
||||
v
|
||||
\end{bmatrix} = \begin{bmatrix}
|
||||
f_x X_c/Z_c + c_x \\
|
||||
f_y Y_c/Z_c + c_y
|
||||
\end{bmatrix}\f]
|
||||
|
||||
where:
|
||||
with
|
||||
|
||||
- \f$(X_w, Y_w, Z_w)\f$ are the coordinates of a 3D point in the world coordinate space
|
||||
- \f$(u, v)\f$ are the coordinates of the projection point in pixels
|
||||
- \f$A\f$ is a camera matrix, or a matrix of intrinsic parameters
|
||||
- \f$(c_x, c_y)\f$ is a principal point that is usually at the image center
|
||||
- \f$f_x, f_y\f$ are the focal lengths expressed in pixel units.
|
||||
|
||||
Thus, if an image from the camera is scaled by a factor, all of these parameters should be scaled
|
||||
(multiplied/divided, respectively) by the same factor. The matrix of intrinsic parameters does not
|
||||
depend on the scene viewed. So, once estimated, it can be re-used as long as the focal length is
|
||||
fixed (in case of zoom lens). The joint rotation-translation matrix \f$[R|t]\f$ is called a matrix of
|
||||
extrinsic parameters. It is used to describe the camera motion around a static scene, or vice versa,
|
||||
rigid motion of an object in front of a still camera. That is, \f$[R|t]\f$ translates coordinates of a
|
||||
world point \f$(X_w, Y_w, Z_w)\f$ to a coordinate system, fixed with respect to the camera.
|
||||
The transformation above is equivalent to the following (when \f$z \ne 0\f$ ):
|
||||
|
||||
\f[\begin{array}{l}
|
||||
\vecthree{X_c}{Y_c}{Z_c} = R \vecthree{X_w}{Y_w}{Z_w} + t \\
|
||||
x' = X_c/Z_c \\
|
||||
y' = Y_c/Z_c \\
|
||||
u = f_x \times x' + c_x \\
|
||||
v = f_y \times y' + c_y
|
||||
\end{array}\f]
|
||||
\f[\vecthree{X_c}{Y_c}{Z_c} = \begin{bmatrix}
|
||||
R|t
|
||||
\end{bmatrix} \begin{bmatrix}
|
||||
X_w \\
|
||||
Y_w \\
|
||||
Z_w \\
|
||||
1
|
||||
\end{bmatrix}.\f]
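The relations above can be sketched in a few lines of standalone C++. This is an illustration under stated assumptions, not the code used by OpenCV: `Vec3`, `Mat3`, `Pixel`, `worldToCamera`, and `projectPinhole` are hypothetical names, \f$R\f$ is stored as a row-major array of nine values, and lens distortion is ignored.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<double, 9>;  // row-major 3x3 rotation matrix

struct Pixel { double u, v; };

// Change of basis from world to camera coordinates: P_c = R * P_w + t.
Vec3 worldToCamera(const Mat3& R, const Vec3& t, const Vec3& Pw) {
    Vec3 Pc{};
    for (int i = 0; i < 3; ++i)
        Pc[i] = R[3 * i] * Pw[0] + R[3 * i + 1] * Pw[1] + R[3 * i + 2] * Pw[2] + t[i];
    return Pc;
}

// Distortion-free pinhole projection:
// u = f_x * X_c / Z_c + c_x,  v = f_y * Y_c / Z_c + c_y.
Pixel projectPinhole(const Vec3& Pc, double fx, double fy, double cx, double cy) {
    assert(Pc[2] != 0.0);  // the formula is only valid for Z_c != 0
    return {fx * Pc[0] / Pc[2] + cx, fy * Pc[1] / Pc[2] + cy};
}
```

Chaining the two helpers, `projectPinhole(worldToCamera(R, t, Pw), fx, fy, cx, cy)`, corresponds to evaluating \f$s \; p = A \begin{bmatrix} R|t \end{bmatrix} P_w\f$ and dividing by \f$s = Z_c\f$.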
|
||||
|
||||
The following figure illustrates the pinhole camera model.
|
||||
|
||||

|
||||
|
||||
Real lenses usually have some distortion, mostly radial distortion and slight tangential distortion.
|
||||
Real lenses usually have some distortion, mostly radial distortion, and slight tangential distortion.
|
||||
So, the above model is extended as:
|
||||
|
||||
\f[\begin{array}{l}
|
||||
\vecthree{X_c}{Y_c}{Z_c} = R \vecthree{X_w}{Y_w}{Z_w} + t \\
|
||||
x' = X_c/Z_c \\
|
||||
y' = Y_c/Z_c \\
|
||||
x'' = x' \frac{1 + k_1 r^2 + k_2 r^4 + k_3 r^6}{1 + k_4 r^2 + k_5 r^4 + k_6 r^6} + 2 p_1 x' y' + p_2(r^2 + 2 x'^2) + s_1 r^2 + s_2 r^4 \\
|
||||
y'' = y' \frac{1 + k_1 r^2 + k_2 r^4 + k_3 r^6}{1 + k_4 r^2 + k_5 r^4 + k_6 r^6} + p_1 (r^2 + 2 y'^2) + 2 p_2 x' y' + s_3 r^2 + s_4 r^4 \\
|
||||
\text{where} \quad r^2 = x'^2 + y'^2 \\
|
||||
u = f_x \times x'' + c_x \\
|
||||
v = f_y \times y'' + c_y
|
||||
\end{array}\f]
|
||||
\f[\begin{bmatrix}
|
||||
u \\
|
||||
v
|
||||
\end{bmatrix} = \begin{bmatrix}
|
||||
f_x x'' + c_x \\
|
||||
f_y y'' + c_y
|
||||
\end{bmatrix}\f]
|
||||
|
||||
\f$k_1\f$, \f$k_2\f$, \f$k_3\f$, \f$k_4\f$, \f$k_5\f$, and \f$k_6\f$ are radial distortion coefficients. \f$p_1\f$ and \f$p_2\f$ are
|
||||
tangential distortion coefficients. \f$s_1\f$, \f$s_2\f$, \f$s_3\f$, and \f$s_4\f$, are the thin prism distortion
|
||||
coefficients. Higher-order coefficients are not considered in OpenCV.
|
||||
where
|
||||
|
||||
\f[\begin{bmatrix}
|
||||
x'' \\
|
||||
y''
|
||||
\end{bmatrix} = \begin{bmatrix}
|
||||
x' \frac{1 + k_1 r^2 + k_2 r^4 + k_3 r^6}{1 + k_4 r^2 + k_5 r^4 + k_6 r^6} + 2 p_1 x' y' + p_2(r^2 + 2 x'^2) + s_1 r^2 + s_2 r^4 \\
|
||||
y' \frac{1 + k_1 r^2 + k_2 r^4 + k_3 r^6}{1 + k_4 r^2 + k_5 r^4 + k_6 r^6} + p_1 (r^2 + 2 y'^2) + 2 p_2 x' y' + s_3 r^2 + s_4 r^4 \\
|
||||
\end{bmatrix}\f]
|
||||
|
||||
with
|
||||
|
||||
\f[r^2 = x'^2 + y'^2\f]
|
||||
|
||||
and
|
||||
|
||||
\f[\begin{bmatrix}
|
||||
x'\\
|
||||
y'
|
||||
\end{bmatrix} = \begin{bmatrix}
|
||||
X_c/Z_c \\
|
||||
Y_c/Z_c
|
||||
\end{bmatrix},\f]
|
||||
|
||||
if \f$Z_c \ne 0\f$.
|
||||
|
||||
The distortion parameters are the radial coefficients \f$k_1\f$, \f$k_2\f$, \f$k_3\f$, \f$k_4\f$, \f$k_5\f$, and \f$k_6\f$,
the tangential distortion coefficients \f$p_1\f$ and \f$p_2\f$, and the thin prism distortion coefficients
\f$s_1\f$, \f$s_2\f$, \f$s_3\f$, and \f$s_4\f$. Higher-order coefficients are not considered in OpenCV.
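The distortion equations for \f$x''\f$ and \f$y''\f$ can be written out term by term as the following standalone C++ sketch. It is an illustration only, not OpenCV's implementation: `DistCoeffs`, `Point2`, and `distort` are hypothetical names, the input is assumed to be already in normalized camera coordinates \f$(x', y')\f$, and the sensor tilt parameters \f$\tau_x, \tau_y\f$ are left out.

```cpp
#include <cassert>

// Hypothetical container; field names mirror the symbols used in the text.
struct DistCoeffs {
    double k1 = 0, k2 = 0, p1 = 0, p2 = 0, k3 = 0;  // radial (k), tangential (p)
    double k4 = 0, k5 = 0, k6 = 0;                  // rational radial terms
    double s1 = 0, s2 = 0, s3 = 0, s4 = 0;          // thin prism terms
};

struct Point2 { double x, y; };

// Maps normalized coordinates (x', y') to distorted coordinates (x'', y'').
Point2 distort(const Point2& n, const DistCoeffs& d) {
    const double x = n.x, y = n.y;
    const double r2 = x * x + y * y;  // r^2 = x'^2 + y'^2
    const double r4 = r2 * r2;
    const double r6 = r4 * r2;
    const double den = 1.0 + d.k4 * r2 + d.k5 * r4 + d.k6 * r6;
    assert(den != 0.0);  // the rational radial term needs a non-zero denominator
    const double radial = (1.0 + d.k1 * r2 + d.k2 * r4 + d.k3 * r6) / den;
    const double xd = x * radial + 2.0 * d.p1 * x * y + d.p2 * (r2 + 2.0 * x * x)
                      + d.s1 * r2 + d.s2 * r4;
    const double yd = y * radial + d.p1 * (r2 + 2.0 * y * y) + 2.0 * d.p2 * x * y
                      + d.s3 * r2 + d.s4 * r4;
    return {xd, yd};
}
```

With all coefficients set to zero the function returns its input unchanged, which matches the distortion-free model; the pixel coordinates then follow from \f$u = f_x x'' + c_x\f$ and \f$v = f_y y'' + c_y\f$.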
|
||||
|
||||
The next figures show two common types of radial distortion: barrel distortion
|
||||
(\f$ 1 + k_1 r^2 + k_2 r^4 + k_3 r^6 \f$ monotonically decreasing)
|
||||
and pincushion distortion (\f$ 1 + k_1 r^2 + k_2 r^4 + k_3 r^6 \f$ monotonically increasing).
|
||||
Radial distortion is always monotonic for real lenses,
|
||||
and if the estimator produces a non monotonic result,
|
||||
and if the estimator produces a non-monotonic result,
|
||||
this should be considered a calibration failure.
|
||||
More generally, radial distortion must be monotonic and the distortion function, must be bijective.
|
||||
More generally, radial distortion must be monotonic and the distortion function must be bijective.
|
||||
A failed estimation result may look deceptively good near the image center
|
||||
but will work poorly in e.g. AR/SFM applications.
|
||||
The optimization method used in OpenCV camera calibration does not include these constraints as
|
||||
@ -134,22 +277,28 @@ See [issue #15992](https://github.com/opencv/opencv/issues/15992) for additional
|
||||

|
||||

|
||||
|
||||
In some cases the image sensor may be tilted in order to focus an oblique plane in front of the
|
||||
In some cases, the image sensor may be tilted in order to focus an oblique plane in front of the
|
||||
camera (Scheimpflug principle). This can be useful for particle image velocimetry (PIV) or
|
||||
triangulation with a laser fan. The tilt causes a perspective distortion of \f$x''\f$ and
|
||||
\f$y''\f$. This distortion can be modelled in the following way, see e.g. @cite Louhichi07.
|
||||
\f$y''\f$. This distortion can be modeled in the following way, see e.g. @cite Louhichi07.
|
||||
|
||||
\f[\begin{array}{l}
|
||||
s\vecthree{x'''}{y'''}{1} =
|
||||
\f[\begin{bmatrix}
|
||||
u \\
|
||||
v
|
||||
\end{bmatrix} = \begin{bmatrix}
|
||||
f_x x''' + c_x \\
|
||||
f_y y''' + c_y
|
||||
\end{bmatrix},\f]
|
||||
|
||||
where
|
||||
|
||||
\f[s\vecthree{x'''}{y'''}{1} =
|
||||
\vecthreethree{R_{33}(\tau_x, \tau_y)}{0}{-R_{13}(\tau_x, \tau_y)}
|
||||
{0}{R_{33}(\tau_x, \tau_y)}{-R_{23}(\tau_x, \tau_y)}
|
||||
{0}{0}{1} R(\tau_x, \tau_y) \vecthree{x''}{y''}{1}\\
|
||||
u = f_x \times x''' + c_x \\
|
||||
v = f_y \times y''' + c_y
|
||||
\end{array}\f]
|
||||
{0}{0}{1} R(\tau_x, \tau_y) \vecthree{x''}{y''}{1}\f]
|
||||
|
||||
where the matrix \f$R(\tau_x, \tau_y)\f$ is defined by two rotations with angular parameter \f$\tau_x\f$
|
||||
and \f$\tau_y\f$, respectively,
|
||||
and the matrix \f$R(\tau_x, \tau_y)\f$ is defined by two rotations with angular parameter
|
||||
\f$\tau_x\f$ and \f$\tau_y\f$, respectively,
|
||||
|
||||
\f[
|
||||
R(\tau_x, \tau_y) =
|
||||
@ -168,8 +317,8 @@ vector. That is, if the vector contains four elements, it means that \f$k_3=0\f$
|
||||
coefficients do not depend on the scene viewed. Thus, they also belong to the intrinsic camera
|
||||
parameters. And they remain the same regardless of the captured image resolution. If, for example, a
|
||||
camera has been calibrated on images of 320 x 240 resolution, absolutely the same distortion
|
||||
coefficients can be used for 640 x 480 images from the same camera while \f$f_x\f$, \f$f_y\f$, \f$c_x\f$, and
|
||||
\f$c_y\f$ need to be scaled appropriately.
|
||||
coefficients can be used for 640 x 480 images from the same camera while \f$f_x\f$, \f$f_y\f$,
|
||||
\f$c_x\f$, and \f$c_y\f$ need to be scaled appropriately.
|
||||
|
||||
The functions below use the above model to do the following:
|
||||
|
||||
@ -181,8 +330,63 @@ pattern (every view is described by several 3D-2D point correspondences).
|
||||
- Estimate the relative position and orientation of the stereo camera "heads" and compute the
|
||||
*rectification* transformation that makes the camera optical axes parallel.
|
||||
|
||||
<B> Homogeneous Coordinates </B><br>
|
||||
Homogeneous coordinates are a system of coordinates that are used in projective geometry. Their use
allows points at infinity to be represented by finite coordinates and simplifies formulas when compared
to their Cartesian counterparts, e.g. they have the advantage that affine transformations can be
expressed as linear homogeneous transformations.
|
||||
|
||||
One obtains the homogeneous vector \f$P_h\f$ by appending a 1 to an n-dimensional Cartesian
vector \f$P\f$, e.g. for a 3D Cartesian vector the mapping \f$P \rightarrow P_h\f$ is:
|
||||
|
||||
\f[\begin{bmatrix}
X \\
Y \\
Z
\end{bmatrix} \rightarrow \begin{bmatrix}
X \\
Y \\
Z \\
1
\end{bmatrix}.\f]
|
||||
|
||||
For the inverse mapping \f$P_h \rightarrow P\f$, one divides all elements of the homogeneous vector
|
||||
by its last element, e.g. for a 3D homogeneous vector one gets its 2D Cartesian counterpart by:
|
||||
|
||||
\f[\begin{bmatrix}
X \\
Y \\
W
\end{bmatrix} \rightarrow \begin{bmatrix}
X / W \\
Y / W
\end{bmatrix},\f]
|
||||
|
||||
if \f$W \ne 0\f$.
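The two mappings can be sketched in standalone C++ as follows. The type aliases and the helper names `toHomogeneous` and `fromHomogeneous` are hypothetical and chosen only for this illustration; they are not OpenCV functions.

```cpp
#include <array>
#include <cassert>

using Vec2 = std::array<double, 2>;
using Vec3 = std::array<double, 3>;
using Vec4 = std::array<double, 4>;

// P -> P_h: append a 1 to a 3D Cartesian vector.
Vec4 toHomogeneous(const Vec3& P) {
    return {P[0], P[1], P[2], 1.0};
}

// P_h -> P: divide a 3D homogeneous vector (X, Y, W) by its last element
// to obtain the 2D Cartesian vector (X / W, Y / W).
Vec2 fromHomogeneous(const Vec3& Ph) {
    assert(Ph[2] != 0.0);  // the inverse mapping requires W != 0
    return {Ph[0] / Ph[2], Ph[1] / Ph[2]};
}
```

For instance, the homogeneous vectors \f$(4, 6, 2)\f$ and \f$(8, 12, 4)\f$ are both mapped to the Cartesian vector \f$(2, 3)\f$ by `fromHomogeneous`.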
|
||||
|
||||
Due to this mapping, all multiples \f$k P_h\f$, for \f$k \ne 0\f$, of a homogeneous point represent
|
||||
the same point \f$P_h\f$. An intuitive understanding of this property is that under a projective
|
||||
transformation, all multiples of \f$P_h\f$ are mapped to the same point. This is the physical
|
||||
observation one makes for pinhole cameras, as all points along a ray through the camera's pinhole are
|
||||
projected to the same image point, e.g. all points along the red ray in the image of the pinhole
|
||||
camera model above would be mapped to the same image coordinate. This property is also the source
|
||||
for the scale ambiguity \f$s\f$ in the equation of the pinhole camera model.
|
||||
|
||||
As mentioned, by using homogeneous coordinates we can express any change of basis parameterized by
|
||||
\f$R\f$ and \f$t\f$ as a linear transformation, e.g. the change of basis from coordinate system
|
||||
0 to coordinate system 1 becomes:
|
||||
|
||||
\f[P_1 = R P_0 + t \rightarrow P_{h_1} = \begin{bmatrix}
R & t \\
0 & 1
\end{bmatrix} P_{h_0}.\f]
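The equivalence of \f$P_1 = R P_0 + t\f$ and the 4-by-4 matrix form can be checked with a short standalone C++ sketch. The names `makeTransform` and `transformPoint` and the row-major array layout are assumptions made for this illustration and do not correspond to OpenCV functions.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;
using Vec4 = std::array<double, 4>;
using Mat3 = std::array<double, 9>;   // row-major 3x3
using Mat4 = std::array<double, 16>;  // row-major 4x4

// Builds the homogeneous transformation [R t; 0 1] from R and t.
Mat4 makeTransform(const Mat3& R, const Vec3& t) {
    Mat4 T{};  // zero-initialized, so the bottom-left block is already 0
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) T[4 * i + j] = R[3 * i + j];
        T[4 * i + 3] = t[i];
    }
    T[15] = 1.0;
    return T;
}

// Applies the 4x4 transformation to a homogeneous vector: P_h1 = T * P_h0.
Vec4 transformPoint(const Mat4& T, const Vec4& Ph) {
    Vec4 out{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j) out[i] += T[4 * i + j] * Ph[j];
    return out;
}
```

As an example with illustrative values, a rotation of 90 degrees about the Z axis combined with \f$t = (1, 2, 3)\f$ sends the homogeneous point \f$(1, 0, 0, 1)\f$ to \f$(1, 3, 3, 1)\f$, which is the same result as computing \f$R P_0 + t\f$ directly.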
|
||||
|
||||
@note
|
||||
- A calibration sample for 3 cameras in horizontal position can be found at
|
||||
- Many functions in this module take a camera matrix as an input parameter. Although all
|
||||
functions assume the same structure of this parameter, they may name it differently. The
|
||||
parameter's description, however, will be clear in that a camera matrix with the structure
|
||||
shown above is required.
|
||||
- A calibration sample for 3 cameras in a horizontal position can be found at
|
||||
opencv_source_code/samples/cpp/3calibration.cpp
|
||||
- A calibration sample based on a sequence of images can be found at
|
||||
opencv_source_code/samples/cpp/calibration.cpp
|
||||
@ -527,10 +731,11 @@ CV_EXPORTS_W void composeRT( InputArray rvec1, InputArray tvec1,
|
||||
|
||||
/** @brief Projects 3D points to an image plane.
|
||||
|
||||
@param objectPoints Array of object points, 3xN/Nx3 1-channel or 1xN/Nx1 3-channel (or
|
||||
vector\<Point3f\> ), where N is the number of points in the view.
|
||||
@param rvec Rotation vector. See Rodrigues for details.
|
||||
@param tvec Translation vector.
|
||||
@param objectPoints Array of object points expressed with respect to the world coordinate frame. A 3xN/Nx3
|
||||
1-channel or 1xN/Nx1 3-channel (or vector\<Point3f\> ), where N is the number of points in the view.
|
||||
@param rvec The rotation vector (@ref Rodrigues) that, together with tvec, performs a change of
|
||||
basis from world to camera coordinate system, see @ref calibrateCamera for details.
|
||||
@param tvec The translation vector, see parameter description above.
|
||||
@param cameraMatrix Camera matrix \f$A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$ .
|
||||
@param distCoeffs Input vector of distortion coefficients
|
||||
\f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6 [, s_1, s_2, s_3, s_4[, \tau_x, \tau_y]]]])\f$ of
|
||||
@ -542,20 +747,21 @@ points with respect to components of the rotation vector, translation vector, fo
|
||||
coordinates of the principal point and the distortion coefficients. In the old interface different
|
||||
components of the jacobian are returned via different output parameters.
|
||||
@param aspectRatio Optional "fixed aspect ratio" parameter. If the parameter is not 0, the
|
||||
function assumes that the aspect ratio (*fx/fy*) is fixed and correspondingly adjusts the jacobian
|
||||
matrix.
|
||||
function assumes that the aspect ratio (\f$f_x / f_y\f$) is fixed and correspondingly adjusts the
|
||||
jacobian matrix.
|
||||
|
||||
The function computes projections of 3D points to the image plane given intrinsic and extrinsic
|
||||
camera parameters. Optionally, the function computes Jacobians - matrices of partial derivatives of
|
||||
image points coordinates (as functions of all the input parameters) with respect to the particular
|
||||
parameters, intrinsic and/or extrinsic. The Jacobians are used during the global optimization in
|
||||
calibrateCamera, solvePnP, and stereoCalibrate . The function itself can also be used to compute a
|
||||
re-projection error given the current intrinsic and extrinsic parameters.
|
||||
The function computes the 2D projections of 3D points to the image plane, given intrinsic and
|
||||
extrinsic camera parameters. Optionally, the function computes Jacobians - matrices of partial
|
||||
derivatives of image points coordinates (as functions of all the input parameters) with respect to
|
||||
the particular parameters, intrinsic and/or extrinsic. The Jacobians are used during the global
|
||||
optimization in @ref calibrateCamera, @ref solvePnP, and @ref stereoCalibrate. The function itself
|
||||
can also be used to compute a re-projection error, given the current intrinsic and extrinsic
|
||||
parameters.
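A re-projection error of the kind mentioned above can be sketched in standalone C++ as the root mean square distance between projected and observed image points. This is one common definition and is only an assumption here; the snippet does not claim to reproduce the exact value reported by any OpenCV function, and `Point2` and `rmsReprojectionError` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point2 { double x, y; };

// RMS re-projection error: sqrt( (1/N) * sum_i ||projected_i - observed_i||^2 ).
double rmsReprojectionError(const std::vector<Point2>& projected,
                            const std::vector<Point2>& observed) {
    assert(projected.size() == observed.size());  // one observation per projection
    assert(!projected.empty());
    double sumSq = 0.0;
    for (std::size_t i = 0; i < projected.size(); ++i) {
        const double dx = projected[i].x - observed[i].x;
        const double dy = projected[i].y - observed[i].y;
        sumSq += dx * dx + dy * dy;
    }
    return std::sqrt(sumSq / static_cast<double>(projected.size()));
}
```

In practice the `projected` points would come from projecting the 3D object points with the current intrinsic and extrinsic parameters, and the `observed` points from a feature detector.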
|
||||
|
||||
@note By setting rvec=tvec=(0,0,0) or by setting cameraMatrix to a 3x3 identity matrix, or by
|
||||
passing zero distortion coefficients, you can get various useful partial cases of the function. This
|
||||
means that you can compute the distorted coordinates for a sparse set of points or apply a
|
||||
perspective transformation (and also compute the derivatives) in the ideal zero-distortion setup.
|
||||
@note By setting rvec = tvec = \f$[0, 0, 0]\f$, or by setting cameraMatrix to a 3x3 identity matrix,
|
||||
or by passing zero distortion coefficients, one can get various useful partial cases of the
|
||||
function. This means that one can compute the distorted coordinates for a sparse set of points or apply
|
||||
a perspective transformation (and also compute the derivatives) in the ideal zero-distortion setup.
|
||||
*/
|
||||
CV_EXPORTS_W void projectPoints( InputArray objectPoints,
|
||||
InputArray rvec, InputArray tvec,
|
||||
@ -1280,44 +1486,48 @@ CV_EXPORTS_W bool findCirclesGrid( InputArray image, Size patternSize,
|
||||
OutputArray centers, int flags = CALIB_CB_SYMMETRIC_GRID,
|
||||
const Ptr<FeatureDetector> &blobDetector = SimpleBlobDetector::create());
|
||||
|
||||
/** @brief Finds the camera intrinsic and extrinsic parameters from several views of a calibration pattern.
|
||||
/** @brief Finds the camera intrinsic and extrinsic parameters from several views of a calibration
|
||||
pattern.
|
||||
|
||||
@param objectPoints In the new interface it is a vector of vectors of calibration pattern points in
|
||||
the calibration pattern coordinate space (e.g. std::vector<std::vector<cv::Vec3f>>). The outer
|
||||
vector contains as many elements as the number of the pattern views. If the same calibration pattern
|
||||
vector contains as many elements as the number of pattern views. If the same calibration pattern
|
||||
is shown in each view and it is fully visible, all the vectors will be the same. Although, it is
|
||||
possible to use partially occluded patterns, or even different patterns in different views. Then,
|
||||
the vectors will be different. The points are 3D, but since they are in a pattern coordinate system,
|
||||
then, if the rig is planar, it may make sense to put the model to a XY coordinate plane so that
|
||||
Z-coordinate of each input object point is 0.
|
||||
possible to use partially occluded patterns or even different patterns in different views. Then,
|
||||
the vectors will be different. Although the points are 3D, they all lie in the calibration pattern's
|
||||
XY coordinate plane (thus 0 in the Z-coordinate), if the used calibration pattern is a planar rig.
|
||||
In the old interface all the vectors of object points from different views are concatenated
|
||||
together.
|
||||
@param imagePoints In the new interface it is a vector of vectors of the projections of calibration
|
||||
pattern points (e.g. std::vector<std::vector<cv::Vec2f>>). imagePoints.size() and
|
||||
objectPoints.size() and imagePoints[i].size() must be equal to objectPoints[i].size() for each i.
|
||||
In the old interface all the vectors of object points from different views are concatenated
|
||||
together.
|
||||
objectPoints.size(), and imagePoints[i].size() and objectPoints[i].size() for each i, must be equal,
|
||||
respectively. In the old interface all the vectors of object points from different views are
|
||||
concatenated together.
|
||||
@param imageSize Size of the image used only to initialize the intrinsic camera matrix.
|
||||
@param cameraMatrix Output 3x3 floating-point camera matrix
|
||||
@param cameraMatrix Input/output 3x3 floating-point camera matrix
|
||||
\f$A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$ . If CALIB_USE_INTRINSIC_GUESS
|
||||
and/or CALIB_FIX_ASPECT_RATIO are specified, some or all of fx, fy, cx, cy must be
|
||||
initialized before calling the function.
|
||||
@param distCoeffs Output vector of distortion coefficients
|
||||
@param distCoeffs Input/output vector of distortion coefficients
|
||||
\f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6 [, s_1, s_2, s_3, s_4[, \tau_x, \tau_y]]]])\f$ of
|
||||
4, 5, 8, 12 or 14 elements.
|
||||
@param rvecs Output vector of rotation vectors (see Rodrigues ) estimated for each pattern view
|
||||
(e.g. std::vector<cv::Mat>>). That is, each k-th rotation vector together with the corresponding
|
||||
k-th translation vector (see the next output parameter description) brings the calibration pattern
|
||||
from the model coordinate space (in which object points are specified) to the world coordinate
|
||||
space, that is, a real position of the calibration pattern in the k-th pattern view (k=0.. *M* -1).
|
||||
@param tvecs Output vector of translation vectors estimated for each pattern view.
|
||||
@param stdDeviationsIntrinsics Output vector of standard deviations estimated for intrinsic parameters.
|
||||
Order of deviations values:
|
||||
@param rvecs Output vector of rotation vectors (@ref Rodrigues ) estimated for each pattern view
|
||||
(e.g. std::vector<cv::Mat>). That is, each i-th rotation vector together with the corresponding
|
||||
i-th translation vector (see the next output parameter description) brings the calibration pattern
|
||||
from the object coordinate space (in which object points are specified) to the camera coordinate
|
||||
space. In more technical terms, the tuple of the i-th rotation and translation vector performs
|
||||
a change of basis from object coordinate space to camera coordinate space. Due to its duality, this
|
||||
tuple is equivalent to the position of the calibration pattern with respect to the camera coordinate
|
||||
space.
|
||||
@param tvecs Output vector of translation vectors estimated for each pattern view, see parameter
|
||||
description above.
|
||||
@param stdDeviationsIntrinsics Output vector of standard deviations estimated for intrinsic
|
||||
parameters. Order of deviations values:
|
||||
\f$(f_x, f_y, c_x, c_y, k_1, k_2, p_1, p_2, k_3, k_4, k_5, k_6 , s_1, s_2, s_3,
|
||||
s_4, \tau_x, \tau_y)\f$ If one of the parameters is not estimated, its deviation is equal to zero.
|
||||
@param stdDeviationsExtrinsics Output vector of standard deviations estimated for extrinsic parameters.
|
||||
Order of deviations values: \f$(R_1, T_1, \dotsc , R_M, T_M)\f$ where M is number of pattern views,
|
||||
\f$R_i, T_i\f$ are concatenated 1x3 vectors.
|
||||
@param stdDeviationsExtrinsics Output vector of standard deviations estimated for extrinsic
|
||||
parameters. Order of deviations values: \f$(R_0, T_0, \dotsc , R_{M - 1}, T_{M - 1})\f$ where M is
|
||||
the number of pattern views. \f$R_i, T_i\f$ are concatenated 1x3 vectors.
|
||||
@param perViewErrors Output vector of the RMS re-projection error estimated for each pattern view.
|
||||
@param flags Different flags that may be zero or a combination of the following values:
|
||||
- **CALIB_USE_INTRINSIC_GUESS** cameraMatrix contains valid initial values of
|
||||
@ -1328,7 +1538,7 @@ estimate extrinsic parameters. Use solvePnP instead.
|
||||
- **CALIB_FIX_PRINCIPAL_POINT** The principal point is not changed during the global
|
||||
optimization. It stays at the center or at a different location specified when
|
||||
CALIB_USE_INTRINSIC_GUESS is set too.
|
||||
- **CALIB_FIX_ASPECT_RATIO** The functions considers only fy as a free parameter. The
|
||||
- **CALIB_FIX_ASPECT_RATIO** The function considers only fy as a free parameter. The
|
||||
ratio fx/fy stays the same as in the input cameraMatrix . When
|
||||
CALIB_USE_INTRINSIC_GUESS is not set, the actual input values of fx and fy are
|
||||
ignored, only their ratio is computed and used further.
|
||||
@ -1362,10 +1572,10 @@ supplied distCoeffs matrix is used. Otherwise, it is set to 0.
|
||||
The function estimates the intrinsic camera parameters and extrinsic parameters for each of the
|
||||
views. The algorithm is based on @cite Zhang2000 and @cite BouguetMCT . The coordinates of 3D object
|
||||
points and their corresponding 2D projections in each view must be specified. That may be achieved
|
||||
by using an object with a known geometry and easily detectable feature points. Such an object is
|
||||
by using an object with known geometry and easily detectable feature points. Such an object is
|
||||
called a calibration rig or calibration pattern, and OpenCV has built-in support for a chessboard as
|
||||
a calibration rig (see findChessboardCorners ). Currently, initialization of intrinsic parameters
|
||||
(when CALIB_USE_INTRINSIC_GUESS is not set) is only implemented for planar calibration
|
||||
a calibration rig (see @ref findChessboardCorners). Currently, initialization of intrinsic
|
||||
parameters (when CALIB_USE_INTRINSIC_GUESS is not set) is only implemented for planar calibration
|
||||
patterns (where Z-coordinates of the object points must be all zeros). 3D calibration rigs can also
|
||||
be used as long as initial cameraMatrix is provided.
|
||||
|
||||
@ -1384,11 +1594,11 @@ The algorithm performs the following steps:
|
||||
objectPoints. See projectPoints for details.
|
||||
|
||||
@note
|
||||
If you use a non-square (=non-NxN) grid and findChessboardCorners for calibration, and
|
||||
calibrateCamera returns bad values (zero distortion coefficients, an image center very far from
|
||||
(w/2-0.5,h/2-0.5), and/or large differences between \f$f_x\f$ and \f$f_y\f$ (ratios of 10:1 or more)),
|
||||
then you have probably used patternSize=cvSize(rows,cols) instead of using
|
||||
patternSize=cvSize(cols,rows) in findChessboardCorners .
|
||||
If you use a non-square (i.e. non-N-by-N) grid and @ref findChessboardCorners for calibration,
|
||||
and @ref calibrateCamera returns bad values (zero distortion coefficients, \f$c_x\f$ and
|
||||
\f$c_y\f$ very far from the image center, and/or large differences between \f$f_x\f$ and
|
||||
\f$f_y\f$ (ratios of 10:1 or more)), then you are probably using patternSize=cvSize(rows,cols)
|
||||
instead of using patternSize=cvSize(cols,rows) in @ref findChessboardCorners.
|
||||
|
||||
@sa
|
||||
findChessboardCorners, solvePnP, initCameraMatrix2D, stereoCalibrate, undistort
|
||||
@ -1444,27 +1654,34 @@ CV_EXPORTS_W void calibrationMatrixValues( InputArray cameraMatrix, Size imageSi
|
||||
CV_OUT double& focalLength, CV_OUT Point2d& principalPoint,
|
||||
CV_OUT double& aspectRatio );
|
||||
|
||||
/** @brief Calibrates the stereo camera.
|
||||
/** @brief Calibrates a stereo camera setup. This function finds the intrinsic parameters
|
||||
for each of the two cameras and the extrinsic parameters between the two cameras.
|
||||
|
||||
@param objectPoints Vector of vectors of the calibration pattern points.
|
||||
@param objectPoints Vector of vectors of the calibration pattern points. The same structure as
|
||||
in @ref calibrateCamera. For each pattern view, both cameras need to see the same object
|
||||
points. Therefore, objectPoints.size(), imagePoints1.size(), and imagePoints2.size() need to be
|
||||
equal as well as objectPoints[i].size(), imagePoints1[i].size(), and imagePoints2[i].size() need to
|
||||
be equal for each i.
|
||||
@param imagePoints1 Vector of vectors of the projections of the calibration pattern points,
|
||||
observed by the first camera.
|
||||
observed by the first camera. The same structure as in @ref calibrateCamera.
|
||||
@param imagePoints2 Vector of vectors of the projections of the calibration pattern points,
|
||||
observed by the second camera.
|
||||
@param cameraMatrix1 Input/output first camera matrix:
|
||||
\f$\vecthreethree{f_x^{(j)}}{0}{c_x^{(j)}}{0}{f_y^{(j)}}{c_y^{(j)}}{0}{0}{1}\f$ , \f$j = 0,\, 1\f$ . If
|
||||
any of CALIB_USE_INTRINSIC_GUESS , CALIB_FIX_ASPECT_RATIO ,
|
||||
CALIB_FIX_INTRINSIC , or CALIB_FIX_FOCAL_LENGTH are specified, some or all of the
|
||||
matrix components must be initialized. See the flags description for details.
|
||||
@param distCoeffs1 Input/output vector of distortion coefficients
|
||||
\f$(k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6 [, s_1, s_2, s_3, s_4[, \tau_x, \tau_y]]]])\f$ of
|
||||
4, 5, 8, 12 or 14 elements. The output vector length depends on the flags.
|
||||
@param cameraMatrix2 Input/output second camera matrix. The parameter is similar to cameraMatrix1
|
||||
@param distCoeffs2 Input/output lens distortion coefficients for the second camera. The parameter
|
||||
is similar to distCoeffs1 .
|
||||
@param imageSize Size of the image used only to initialize intrinsic camera matrix.
|
||||
@param R Output rotation matrix between the 1st and the 2nd camera coordinate systems.
|
||||
@param T Output translation vector between the coordinate systems of the cameras.
|
||||
observed by the second camera. The same structure as in @ref calibrateCamera.
|
||||
@param cameraMatrix1 Input/output camera matrix for the first camera, the same as in
|
||||
@ref calibrateCamera. Furthermore, for the stereo case, additional flags may be used, see below.
|
||||
@param distCoeffs1 Input/output vector of distortion coefficients, the same as in
|
||||
@ref calibrateCamera.
|
||||
@param cameraMatrix2 Input/output camera matrix for the second camera. See description for
|
||||
cameraMatrix1.
|
||||
@param distCoeffs2 Input/output lens distortion coefficients for the second camera. See
|
||||
description for distCoeffs1.
|
||||
@param imageSize Size of the image used only to initialize the intrinsic camera matrices.
|
||||
@param R Output rotation matrix. Together with the translation vector T, this matrix brings
|
||||
points given in the first camera's coordinate system to points in the second camera's
|
||||
coordinate system. In more technical terms, the tuple of R and T performs a change of basis
|
||||
from the first camera's coordinate system to the second camera's coordinate system. Due to its
|
||||
duality, this tuple is equivalent to the position of the first camera with respect to the
|
||||
second camera coordinate system.
|
||||
@param T Output translation vector, see description above.
|
||||
@param E Output essential matrix.
|
||||
@param F Output fundamental matrix.
|
||||
@param perViewErrors Output vector of the RMS re-projection error estimated for each pattern view.
|
||||
@ -1473,8 +1690,8 @@ is similar to distCoeffs1 .
|
||||
matrices are estimated.
|
||||
- **CALIB_USE_INTRINSIC_GUESS** Optimize some or all of the intrinsic parameters
|
||||
according to the specified flags. Initial values are provided by the user.
|
||||
- **CALIB_USE_EXTRINSIC_GUESS** R, T contain valid initial values that are optimized further.
|
||||
Otherwise R, T are initialized to the median value of the pattern views (each dimension separately).
|
||||
- **CALIB_USE_EXTRINSIC_GUESS** R and T contain valid initial values that are optimized further.
|
||||
Otherwise R and T are initialized to the median value of the pattern views (each dimension separately).
|
||||
- **CALIB_FIX_PRINCIPAL_POINT** Fix the principal points during the optimization.
|
||||
- **CALIB_FIX_FOCAL_LENGTH** Fix \f$f^{(j)}_x\f$ and \f$f^{(j)}_y\f$ .
|
||||
- **CALIB_FIX_ASPECT_RATIO** Optimize \f$f^{(j)}_y\f$ . Fix the ratio \f$f^{(j)}_x/f^{(j)}_y\f$
|
||||
@ -1505,29 +1722,49 @@ the optimization. If CALIB_USE_INTRINSIC_GUESS is set, the coefficient from the
|
||||
supplied distCoeffs matrix is used. Otherwise, it is set to 0.
|
||||
@param criteria Termination criteria for the iterative optimization algorithm.
|
||||
|
||||
The function estimates transformation between two cameras making a stereo pair. If you have a stereo
|
||||
camera where the relative position and orientation of two cameras is fixed, and if you computed
|
||||
poses of an object relative to the first camera and to the second camera, (R1, T1) and (R2, T2),
|
||||
respectively (this can be done with solvePnP ), then those poses definitely relate to each other.
|
||||
This means that, given ( \f$R_1\f$,\f$T_1\f$ ), it should be possible to compute ( \f$R_2\f$,\f$T_2\f$ ). You only
|
||||
need to know the position and orientation of the second camera relative to the first camera. This is
|
||||
what the described function does. It computes ( \f$R\f$,\f$T\f$ ) so that:
|
||||
The function estimates the transformation between two cameras making a stereo pair. If one computes
|
||||
the poses of an object relative to the first camera and to the second camera,
|
||||
( \f$R_1\f$,\f$T_1\f$ ) and (\f$R_2\f$,\f$T_2\f$), respectively, for a stereo camera where the
|
||||
relative position and orientation between the two cameras are fixed, then those poses definitely
|
||||
relate to each other. This means, if the relative position and orientation (\f$R\f$,\f$T\f$) of the
|
||||
two cameras are known, it is possible to compute (\f$R_2\f$,\f$T_2\f$) when (\f$R_1\f$,\f$T_1\f$) is
|
||||
given. This is what the described function does. It computes (\f$R\f$,\f$T\f$) such that:
|
||||
|
||||
\f[R_2=R R_1\f]
|
||||
\f[T_2=R T_1 + T.\f]
|
||||
|
||||
Therefore, one can compute the coordinate representation of a 3D point for the second camera's
|
||||
coordinate system when given the point's coordinate representation in the first camera's coordinate
|
||||
system:
|
||||
|
||||
\f[\begin{bmatrix}
|
||||
X_2 \\
|
||||
Y_2 \\
|
||||
Z_2 \\
|
||||
1
|
||||
\end{bmatrix} = \begin{bmatrix}
|
||||
R & T \\
|
||||
0 & 1
|
||||
\end{bmatrix} \begin{bmatrix}
|
||||
X_1 \\
|
||||
Y_1 \\
|
||||
Z_1 \\
|
||||
1
|
||||
\end{bmatrix}.\f]
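
The relations above can be sketched in plain C++ without any library dependency. This is only an illustration under stated assumptions: matrices are stored row-major, and the names `Pose`, `poseInSecondCamera` and `pointInSecondCamera` are hypothetical helpers, not OpenCV functions.

```cpp
#include <array>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // row-major 3x3 matrix

// Matrix-matrix product a * b.
Mat3 matMul(const Mat3& a, const Mat3& b)
{
    Mat3 c{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Matrix-vector product a * v.
Vec3 matVec(const Mat3& a, const Vec3& v)
{
    Vec3 r{};
    for (int i = 0; i < 3; ++i)
        for (int k = 0; k < 3; ++k)
            r[i] += a[i][k] * v[k];
    return r;
}

// Pose of an object with respect to one camera.
struct Pose { Mat3 R; Vec3 T; };

// Computes (R2, T2) from (R1, T1) and the stereo extrinsics (R, T):
// R2 = R * R1 and T2 = R * T1 + T.
Pose poseInSecondCamera(const Mat3& R, const Vec3& T, const Pose& inFirst)
{
    Pose out;
    out.R = matMul(R, inFirst.R);
    out.T = matVec(R, inFirst.T);
    for (int i = 0; i < 3; ++i)
        out.T[i] += T[i];
    return out;
}

// Maps a 3D point from the first camera's coordinate system to the second
// camera's coordinate system: p2 = R * p1 + T.
Vec3 pointInSecondCamera(const Mat3& R, const Vec3& T, const Vec3& p1)
{
    Vec3 p2 = matVec(R, p1);
    for (int i = 0; i < 3; ++i)
        p2[i] += T[i];
    return p2;
}
```

`pointInSecondCamera` is the non-homogeneous form of the 4x4 matrix equation: the last row of the homogeneous matrix only carries the constant 1 along.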
|
||||
|
||||
\f[R_2=R*R_1\f]
|
||||
\f[T_2=R*T_1 + T,\f]
|
||||
|
||||
Optionally, it computes the essential matrix E:
|
||||
|
||||
\f[E= \vecthreethree{0}{-T_2}{T_1}{T_2}{0}{-T_0}{-T_1}{T_0}{0} *R\f]
|
||||
\f[E= \vecthreethree{0}{-T_2}{T_1}{T_2}{0}{-T_0}{-T_1}{T_0}{0} R\f]
|
||||
|
||||
where \f$T_i\f$ are components of the translation vector \f$T\f$ : \f$T=[T_0, T_1, T_2]^T\f$ . And the function
|
||||
can also compute the fundamental matrix F:
|
||||
where \f$T_i\f$ are components of the translation vector \f$T\f$ : \f$T=[T_0, T_1, T_2]^T\f$ .
|
||||
And the function can also compute the fundamental matrix F:
|
||||
|
||||
\f[F = cameraMatrix2^{-T} E cameraMatrix1^{-1}\f]
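
As a hedged illustration of the two formulas, the following sketch builds E from (R, T) and F from E and the two camera matrices, using only the standard library. It assumes camera matrices with zero skew (so that the inverse has a closed form); all function names are hypothetical and chosen for this example only.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // row-major 3x3 matrix

Mat3 matMul(const Mat3& a, const Mat3& b)
{
    Mat3 c{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

Mat3 transpose(const Mat3& a)
{
    Mat3 t{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            t[i][j] = a[j][i];
    return t;
}

// Skew-symmetric matrix of T = [T_0, T_1, T_2]^T, as in the formula for E.
Mat3 skew(const Vec3& t)
{
    Mat3 s{};
    s[0][1] = -t[2]; s[0][2] =  t[1];
    s[1][0] =  t[2]; s[1][2] = -t[0];
    s[2][0] = -t[1]; s[2][1] =  t[0];
    return s;
}

// E = [T]_x * R
Mat3 essentialFromPose(const Mat3& R, const Vec3& T)
{
    return matMul(skew(T), R);
}

// Closed-form inverse of a camera matrix; assumes zero skew.
Mat3 invertCameraMatrix(const Mat3& K)
{
    const double fx = K[0][0], fy = K[1][1], cx = K[0][2], cy = K[1][2];
    assert(fx != 0.0 && fy != 0.0);
    Mat3 inv{};
    inv[0][0] = 1.0 / fx; inv[0][2] = -cx / fx;
    inv[1][1] = 1.0 / fy; inv[1][2] = -cy / fy;
    inv[2][2] = 1.0;
    return inv;
}

// F = K2^{-T} * E * K1^{-1}
Mat3 fundamentalFromEssential(const Mat3& E, const Mat3& K1, const Mat3& K2)
{
    return matMul(matMul(transpose(invertCameraMatrix(K2)), E),
                  invertCameraMatrix(K1));
}

// Evaluates x2^T * M * x1 for homogeneous points x1 and x2; the value is zero
// (up to rounding) for a correct correspondence.
double epipolarResidual(const Mat3& M, const Vec3& x1, const Vec3& x2)
{
    double r = 0.0;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            r += x2[i] * M[i][j] * x1[j];
    return r;
}
```

With E, the residual is evaluated on normalized image coordinates; with F, it is evaluated on homogeneous pixel coordinates.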
|
||||
|
||||
Besides the stereo-related information, the function can also perform a full calibration of each of
|
||||
two cameras. However, due to the high dimensionality of the parameter space and noise in the input
|
||||
data, the function can diverge from the correct solution. If the intrinsic parameters can be
|
||||
the two cameras. However, due to the high dimensionality of the parameter space and noise in the
|
||||
input data, the function can diverge from the correct solution. If the intrinsic parameters can be
|
||||
estimated with high accuracy for each of the cameras individually (for example, using
|
||||
calibrateCamera ), you are recommended to do so and then pass CALIB_FIX_INTRINSIC flag to the
|
||||
function along with the computed intrinsic parameters. Otherwise, if all the parameters are
|
||||
@ -1563,15 +1800,25 @@ CV_EXPORTS_W double stereoCalibrate( InputArrayOfArrays objectPoints,
|
||||
@param cameraMatrix2 Second camera matrix.
|
||||
@param distCoeffs2 Second camera distortion parameters.
|
||||
@param imageSize Size of the image used for stereo calibration.
|
||||
@param R Rotation matrix from the coordinate system of the first camera to the second.
|
||||
@param T Translation vector from the coordinate system of the first camera to the second.
|
||||
@param R1 Output 3x3 rectification transform (rotation matrix) for the first camera.
|
||||
@param R2 Output 3x3 rectification transform (rotation matrix) for the second camera.
|
||||
@param R Rotation matrix from the coordinate system of the first camera to the second camera,
|
||||
see @ref stereoCalibrate.
|
||||
@param T Translation vector from the coordinate system of the first camera to the second camera,
|
||||
see @ref stereoCalibrate.
|
||||
@param R1 Output 3x3 rectification transform (rotation matrix) for the first camera. This matrix
|
||||
brings points given in the unrectified first camera's coordinate system to points in the rectified
|
||||
first camera's coordinate system. In more technical terms, it performs a change of basis from the
|
||||
unrectified first camera's coordinate system to the rectified first camera's coordinate system.
|
||||
@param R2 Output 3x3 rectification transform (rotation matrix) for the second camera. This matrix
|
||||
brings points given in the unrectified second camera's coordinate system to points in the rectified
|
||||
second camera's coordinate system. In more technical terms, it performs a change of basis from the
|
||||
unrectified second camera's coordinate system to the rectified second camera's coordinate system.
|
||||
@param P1 Output 3x4 projection matrix in the new (rectified) coordinate systems for the first
|
||||
camera.
|
||||
camera, i.e. it projects points given in the rectified first camera coordinate system into the
|
||||
rectified first camera's image.
|
||||
@param P2 Output 3x4 projection matrix in the new (rectified) coordinate systems for the second
|
||||
camera.
|
||||
@param Q Output \f$4 \times 4\f$ disparity-to-depth mapping matrix (see reprojectImageTo3D ).
|
||||
camera, i.e. it projects points given in the rectified first camera coordinate system into the
|
||||
rectified second camera's image.
|
||||
@param Q Output \f$4 \times 4\f$ disparity-to-depth mapping matrix (see @ref reprojectImageTo3D).
|
||||
@param flags Operation flags that may be zero or CALIB_ZERO_DISPARITY . If the flag is set,
|
||||
the function makes the principal points of each camera have the same pixel coordinates in the
|
||||
rectified views. And if the flag is not set, the function may still shift the images in the
|
||||
@ -1582,11 +1829,11 @@ scaling. Otherwise, the parameter should be between 0 and 1. alpha=0 means that
|
||||
images are zoomed and shifted so that only valid pixels are visible (no black areas after
|
||||
rectification). alpha=1 means that the rectified image is decimated and shifted so that all the
|
||||
pixels from the original images from the cameras are retained in the rectified images (no source
|
||||
image pixels are lost). Obviously, any intermediate value yields an intermediate result between
|
||||
image pixels are lost). Any intermediate value yields an intermediate result between
|
||||
those two extreme cases.
|
||||
@param newImageSize New image resolution after rectification. The same size should be passed to
|
||||
initUndistortRectifyMap (see the stereo_calib.cpp sample in OpenCV samples directory). When (0,0)
|
||||
is passed (default), it is set to the original imageSize . Setting it to larger value can help you
|
||||
is passed (default), it is set to the original imageSize . Setting it to a larger value can help you
|
||||
preserve details in the original image, especially when there is a big radial distortion.
|
||||
@param validPixROI1 Optional output rectangles inside the rectified images where all the pixels
|
||||
are valid. If alpha=0 , the ROIs cover the whole images. Otherwise, they are likely to be smaller
|
||||
@ -1602,27 +1849,43 @@ as input. As output, it provides two rotation matrices and also two projection m
|
||||
coordinates. The function distinguishes the following two cases:
|
||||
|
||||
- **Horizontal stereo**: the first and the second camera views are shifted relative to each other
|
||||
mainly along the x axis (with possible small vertical shift). In the rectified images, the
|
||||
mainly along the x-axis (with possible small vertical shift). In the rectified images, the
|
||||
corresponding epipolar lines in the left and right cameras are horizontal and have the same
|
||||
y-coordinate. P1 and P2 look like:
|
||||
|
||||
\f[\texttt{P1} = \begin{bmatrix} f & 0 & cx_1 & 0 \\ 0 & f & cy & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}\f]
|
||||
\f[\texttt{P1} = \begin{bmatrix}
|
||||
f & 0 & cx_1 & 0 \\
|
||||
0 & f & cy & 0 \\
|
||||
0 & 0 & 1 & 0
|
||||
\end{bmatrix}\f]
|
||||
|
||||
\f[\texttt{P2} = \begin{bmatrix} f & 0 & cx_2 & T_x*f \\ 0 & f & cy & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} ,\f]
|
||||
\f[\texttt{P2} = \begin{bmatrix}
|
||||
f & 0 & cx_2 & T_x*f \\
|
||||
0 & f & cy & 0 \\
|
||||
0 & 0 & 1 & 0
|
||||
\end{bmatrix} ,\f]
|
||||
|
||||
where \f$T_x\f$ is a horizontal shift between the cameras and \f$cx_1=cx_2\f$ if
|
||||
CALIB_ZERO_DISPARITY is set.
|
||||
|
||||
- **Vertical stereo**: the first and the second camera views are shifted relative to each other
|
||||
mainly in vertical direction (and probably a bit in the horizontal direction too). The epipolar
|
||||
mainly in the vertical direction (and probably a bit in the horizontal direction too). The epipolar
|
||||
lines in the rectified images are vertical and have the same x-coordinate. P1 and P2 look like:
|
||||
|
||||
\f[\texttt{P1} = \begin{bmatrix} f & 0 & cx & 0 \\ 0 & f & cy_1 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}\f]
|
||||
\f[\texttt{P1} = \begin{bmatrix}
|
||||
f & 0 & cx & 0 \\
|
||||
0 & f & cy_1 & 0 \\
|
||||
0 & 0 & 1 & 0
|
||||
\end{bmatrix}\f]
|
||||
|
||||
\f[\texttt{P2} = \begin{bmatrix} f & 0 & cx & 0 \\ 0 & f & cy_2 & T_y*f \\ 0 & 0 & 1 & 0 \end{bmatrix} ,\f]
|
||||
\f[\texttt{P2} = \begin{bmatrix}
|
||||
f & 0 & cx & 0 \\
|
||||
0 & f & cy_2 & T_y*f \\
|
||||
0 & 0 & 1 & 0
|
||||
\end{bmatrix},\f]
|
||||
|
||||
where \f$T_y\f$ is a vertical shift between the cameras and \f$cy_1=cy_2\f$ if CALIB_ZERO_DISPARITY is
|
||||
set.
|
||||
where \f$T_y\f$ is a vertical shift between the cameras and \f$cy_1=cy_2\f$ if
|
||||
CALIB_ZERO_DISPARITY is set.
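
To make the horizontal case concrete, here is a small sketch that builds P1 and P2 of the shape shown above and projects one 3D point with both. It assumes \f$cx_1=cx_2\f$ (as with CALIB_ZERO_DISPARITY); `makeRectifiedProjection`, `project` and `Pixel` are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cassert>

using Mat34 = std::array<std::array<double, 4>, 3>;  // row-major 3x4 matrix

struct Pixel { double x, y; };

// Builds a projection matrix of the horizontal-stereo form
// [f 0 cx Tx*f; 0 f cy 0; 0 0 1 0]. Use Tx = 0 for the first camera.
Mat34 makeRectifiedProjection(double f, double cx, double cy, double Tx)
{
    Mat34 P{};
    P[0][0] = f; P[0][2] = cx; P[0][3] = Tx * f;
    P[1][1] = f; P[1][2] = cy;
    P[2][2] = 1.0;
    return P;
}

// Projects a point (X, Y, Z), given in the rectified first camera's coordinate
// system, into the image described by P.
Pixel project(const Mat34& P, double X, double Y, double Z)
{
    const double u = P[0][0] * X + P[0][1] * Y + P[0][2] * Z + P[0][3];
    const double v = P[1][0] * X + P[1][1] * Y + P[1][2] * Z + P[1][3];
    const double w = P[2][0] * X + P[2][1] * Y + P[2][2] * Z + P[2][3];
    assert(w != 0.0);
    return Pixel{u / w, v / w};
}
```

Under these assumptions both projections share the same y-coordinate, and the difference of the x-coordinates equals \f$-T_x f / Z\f$, which is why depth can be read off the disparity after rectification.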
|
||||
|
||||
As you can see, the first three columns of P1 and P2 will effectively be the new "rectified" camera
|
||||
matrices. The matrices, together with R1 and R2 , can then be passed to initUndistortRectifyMap to
|
||||
@ -2029,35 +2292,47 @@ CV_EXPORTS_W Mat findEssentialMat( InputArray points1, InputArray points2,
|
||||
@param R2 Another possible rotation matrix.
|
||||
@param t One possible translation.
|
||||
|
||||
This function decompose an essential matrix E using svd decomposition @cite HartleyZ00 . Generally 4
|
||||
possible poses exists for a given E. They are \f$[R_1, t]\f$, \f$[R_1, -t]\f$, \f$[R_2, t]\f$, \f$[R_2, -t]\f$. By
|
||||
decomposing E, you can only get the direction of the translation, so the function returns unit t.
|
||||
This function decomposes the essential matrix E using singular value decomposition (SVD) @cite HartleyZ00. In
|
||||
general, four possible poses exist for the decomposition of E. They are \f$[R_1, t]\f$,
|
||||
\f$[R_1, -t]\f$, \f$[R_2, t]\f$, \f$[R_2, -t]\f$.
|
||||
|
||||
If E gives the epipolar constraint \f$[p_2; 1]^T A^{-T} E A^{-1} [p_1; 1] = 0\f$ between the image
|
||||
points \f$p_1\f$ in the first image and \f$p_2\f$ in the second image, then any of the tuples
|
||||
\f$[R_1, t]\f$, \f$[R_1, -t]\f$, \f$[R_2, t]\f$, \f$[R_2, -t]\f$ is a change of basis from the first
|
||||
camera's coordinate system to the second camera's coordinate system. However, by decomposing E, one
|
||||
can only get the direction of the translation. For this reason, the translation t is returned with
|
||||
unit length.
|
||||
*/
|
||||
CV_EXPORTS_W void decomposeEssentialMat( InputArray E, OutputArray R1, OutputArray R2, OutputArray t );
|
||||
|
||||
/** @brief Recover relative camera rotation and translation from an estimated essential matrix and the
|
||||
corresponding points in two images, using cheirality check. Returns the number of inliers which pass
|
||||
the check.
|
||||
/** @brief Recovers the relative camera rotation and the translation from an estimated essential
|
||||
matrix and the corresponding points in two images, using a cheirality check. Returns the number of
|
||||
inliers that pass the check.
|
||||
|
||||
@param E The input essential matrix.
|
||||
@param points1 Array of N 2D points from the first image. The point coordinates should be
|
||||
floating-point (single or double precision).
|
||||
@param points2 Array of the second image points of the same size and format as points1 .
|
||||
@param cameraMatrix Camera matrix \f$K = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$ .
|
||||
@param cameraMatrix Camera matrix \f$A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$ .
|
||||
Note that this function assumes that points1 and points2 are feature points from cameras with the
|
||||
same camera matrix.
|
||||
@param R Recovered relative rotation.
|
||||
@param t Recovered relative translation.
|
||||
@param mask Input/output mask for inliers in points1 and points2.
|
||||
: If it is not empty, then it marks inliers in points1 and points2 for then given essential
|
||||
matrix E. Only these inliers will be used to recover pose. In the output mask only inliers
|
||||
which pass the cheirality check.
|
||||
This function decomposes an essential matrix using decomposeEssentialMat and then verifies possible
|
||||
pose hypotheses by doing cheirality check. The cheirality check basically means that the
|
||||
@param R Output rotation matrix. Together with the translation vector, this matrix makes up a tuple
|
||||
that performs a change of basis from the first camera's coordinate system to the second camera's
|
||||
coordinate system. Note that, in general, t can not be used for this tuple, see the parameter
|
||||
described below.
|
||||
@param t Output translation vector. This vector is obtained by @ref decomposeEssentialMat and
|
||||
therefore is only known up to scale, i.e. t is the direction of the translation vector and has unit
|
||||
length.
|
||||
@param mask Input/output mask for inliers in points1 and points2. If it is not empty, then it marks
|
||||
inliers in points1 and points2 for the given essential matrix E. Only these inliers will be used to
|
||||
recover pose. In the output mask, only the inliers that pass the cheirality check are marked.
|
||||
|
||||
This function decomposes an essential matrix using @ref decomposeEssentialMat and then verifies
|
||||
possible pose hypotheses by performing a cheirality check. The cheirality check means that the
|
||||
triangulated 3D points should have positive depth. Some details can be found in @cite Nister03.
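
The idea of the cheirality check can be sketched as follows. This is not the library's implementation, only a simplified illustration under stated assumptions: x1 and x2 are normalized homogeneous image points (camera matrix already removed), the depths are found by a least-squares solution of \f$\lambda_2 x_2 = \lambda_1 R x_1 + t\f$, and all names are hypothetical.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;  // row-major 3x3 matrix

double dot(const Vec3& a, const Vec3& b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

Vec3 matVec(const Mat3& a, const Vec3& v)
{
    Vec3 r{};
    for (int i = 0; i < 3; ++i)
        for (int k = 0; k < 3; ++k)
            r[i] += a[i][k] * v[k];
    return r;
}

// Depths of the triangulated point along the two viewing rays.
struct Depths { double first, second; bool valid; };

// Solves lambda2 * x2 = lambda1 * R * x1 + t for (lambda1, lambda2) in the
// least-squares sense via the 2x2 normal equations.
Depths triangulateDepths(const Mat3& R, const Vec3& t, const Vec3& x1, const Vec3& x2)
{
    const Vec3 a = matVec(R, x1);
    const double aa = dot(a, a), bb = dot(x2, x2), ab = dot(a, x2);
    const double at = dot(a, t), bt = dot(x2, t);
    const double det = aa * bb - ab * ab;
    Depths d{0.0, 0.0, false};
    if (std::fabs(det) < 1e-12)
        return d;  // rays are (nearly) parallel, the point is at infinity
    d.first = (ab * bt - at * bb) / det;
    d.second = (aa * bt - ab * at) / det;
    d.valid = true;
    return d;
}

// A pose hypothesis passes if the point lies in front of both cameras.
bool passesCheirality(const Mat3& R, const Vec3& t, const Vec3& x1, const Vec3& x2)
{
    const Depths d = triangulateDepths(R, t, x1, x2);
    return d.valid && d.first > 0.0 && d.second > 0.0;
}
```

Applying `passesCheirality` to each of the four hypotheses from the decomposition and counting the correspondences that pass is one way to select the physically plausible pose. Because t has unit length, the recovered depths are expressed in units of the baseline.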
|
||||
|
||||
This function can be used to process output E and mask from findEssentialMat. In this scenario,
|
||||
points1 and points2 are the same input for findEssentialMat. :
|
||||
This function can be used to process the output E and mask from @ref findEssentialMat. In this
|
||||
scenario, points1 and points2 are the same inputs as those passed to findEssentialMat:
|
||||
@code
|
||||
// Example. Estimation of fundamental matrix using the RANSAC algorithm
|
||||
int point_count = 100;
|
||||
@ -2089,20 +2364,24 @@ CV_EXPORTS_W int recoverPose( InputArray E, InputArray points1, InputArray point
|
||||
@param points1 Array of N 2D points from the first image. The point coordinates should be
|
||||
floating-point (single or double precision).
|
||||
@param points2 Array of the second image points of the same size and format as points1 .
|
||||
@param R Recovered relative rotation.
|
||||
@param t Recovered relative translation.
|
||||
@param R Output rotation matrix. Together with the translation vector, this matrix makes up a tuple
|
||||
that performs a change of basis from the first camera's coordinate system to the second camera's
|
||||
coordinate system. Note that, in general, t can not be used for this tuple, see the parameter
|
||||
description below.
|
||||
@param t Output translation vector. This vector is obtained by @ref decomposeEssentialMat and
|
||||
therefore is only known up to scale, i.e. t is the direction of the translation vector and has unit
|
||||
length.
|
||||
@param focal Focal length of the camera. Note that this function assumes that points1 and points2
|
||||
are feature points from cameras with same focal length and principal point.
|
||||
@param pp principal point of the camera.
|
||||
@param mask Input/output mask for inliers in points1 and points2.
|
||||
: If it is not empty, then it marks inliers in points1 and points2 for then given essential
|
||||
matrix E. Only these inliers will be used to recover pose. In the output mask only inliers
|
||||
which pass the cheirality check.
|
||||
@param mask Input/output mask for inliers in points1 and points2. If it is not empty, then it marks
|
||||
inliers in points1 and points2 for the given essential matrix E. Only these inliers will be used to
|
||||
recover pose. In the output mask, only the inliers that pass the cheirality check are marked.
|
||||
|
||||
This function differs from the one above in that it computes the camera matrix from the focal length and
|
||||
principal point:
|
||||
|
||||
\f[K =
|
||||
\f[A =
|
||||
\begin{bmatrix}
|
||||
f & 0 & x_{pp} \\
|
||||
0 & f & y_{pp} \\
|
||||
@ -2119,19 +2398,26 @@ CV_EXPORTS_W int recoverPose( InputArray E, InputArray points1, InputArray point
|
||||
@param points1 Array of N 2D points from the first image. The point coordinates should be
|
||||
floating-point (single or double precision).
|
||||
@param points2 Array of the second image points of the same size and format as points1.
|
||||
@param cameraMatrix Camera matrix \f$K = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$ .
|
||||
@param cameraMatrix Camera matrix \f$A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$ .
|
||||
Note that this function assumes that points1 and points2 are feature points from cameras with the
|
||||
same camera matrix.
|
||||
@param R Recovered relative rotation.
|
||||
@param t Recovered relative translation.
|
||||
@param distanceThresh threshold distance which is used to filter out far away points (i.e. infinite points).
|
||||
@param mask Input/output mask for inliers in points1 and points2.
|
||||
: If it is not empty, then it marks inliers in points1 and points2 for then given essential
|
||||
matrix E. Only these inliers will be used to recover pose. In the output mask only inliers
|
||||
which pass the cheirality check.
|
||||
@param triangulatedPoints 3d points which were reconstructed by triangulation.
|
||||
*/
|
||||
@param R Output rotation matrix. Together with the translation vector, this matrix makes up a tuple
|
||||
that performs a change of basis from the first camera's coordinate system to the second camera's
|
||||
coordinate system. Note that, in general, t can not be used for this tuple, see the parameter
|
||||
description below.
|
||||
@param t Output translation vector. This vector is obtained by @ref decomposeEssentialMat and
|
||||
therefore is only known up to scale, i.e. t is the direction of the translation vector and has unit
|
||||
length.
|
||||
@param distanceThresh threshold distance which is used to filter out far away points (i.e. infinite
|
||||
points).
|
||||
@param mask Input/output mask for inliers in points1 and points2. If it is not empty, then it marks
|
||||
inliers in points1 and points2 for the given essential matrix E. Only these inliers will be used to
|
||||
recover pose. In the output mask, only the inliers that pass the cheirality check are marked.
|
||||
@param triangulatedPoints 3D points which were reconstructed by triangulation.
|
||||
|
||||
This function differs from the one above in that it outputs the triangulated 3D points that are used for
|
||||
the cheirality check.
|
||||
*/
|
||||
CV_EXPORTS_W int recoverPose( InputArray E, InputArray points1, InputArray points2,
|
||||
InputArray cameraMatrix, OutputArray R, OutputArray t, double distanceThresh, InputOutputArray mask = noArray(),
|
||||
OutputArray triangulatedPoints = noArray());
|
||||
@ -2162,22 +2448,27 @@ Line coefficients are defined up to a scale. They are normalized so that \f$a_i^
|
||||
CV_EXPORTS_W void computeCorrespondEpilines( InputArray points, int whichImage,
|
||||
InputArray F, OutputArray lines );
|
||||
|
||||
/** @brief Reconstructs points by triangulation.
|
||||
/** @brief This function reconstructs 3-dimensional points (in homogeneous coordinates) by using
|
||||
their observations with a stereo camera.
|
||||
|
||||
@param projMatr1 3x4 projection matrix of the first camera.
|
||||
@param projMatr2 3x4 projection matrix of the second camera.
|
||||
@param projPoints1 2xN array of feature points in the first image. In case of c++ version it can
|
||||
be also a vector of feature points or two-channel matrix of size 1xN or Nx1.
|
||||
@param projPoints2 2xN array of corresponding points in the second image. In case of c++ version
|
||||
@param projMatr1 3x4 projection matrix of the first camera, i.e. this matrix projects 3D points
|
||||
given in the world's coordinate system into the first image.
|
||||
@param projMatr2 3x4 projection matrix of the second camera, i.e. this matrix projects 3D points
|
||||
given in the world's coordinate system into the second image.
|
||||
@param projPoints1 2xN array of feature points in the first image. In the case of the C++ version,
|
||||
it can also be a vector of feature points or a two-channel matrix of size 1xN or Nx1.
|
||||
@param points4D 4xN array of reconstructed points in homogeneous coordinates.
|
||||
|
||||
The function reconstructs 3-dimensional points (in homogeneous coordinates) by using their
|
||||
observations with a stereo camera. Projections matrices can be obtained from stereoRectify.
|
||||
@param projPoints2 2xN array of corresponding points in the second image. In the case of the C++
|
||||
version, it can also be a vector of feature points or a two-channel matrix of size 1xN or Nx1.
|
||||
@param points4D 4xN array of reconstructed points in homogeneous coordinates. These points are
|
||||
returned in the world's coordinate system.
|
||||
|
||||
@note
|
||||
Keep in mind that all input data should be of float type in order for this function to work.
|
||||
|
||||
@note
|
||||
If the projection matrices from @ref stereoRectify are used, then the returned points are
|
||||
represented in the first camera's rectified coordinate system.
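
Since the reconstructed points are homogeneous, a typical follow-up step is to divide by the fourth coordinate. The sketch below shows that step with standard containers only; `fromHomogeneous` and `Point3` are hypothetical names, and marking points at infinity with NaN is a choice made for this example, not library behavior.

```cpp
#include <array>
#include <cmath>
#include <limits>
#include <vector>

struct Point3 { double x, y, z; };

// Converts homogeneous points (X, Y, Z, W) to Euclidean points (X/W, Y/W, Z/W).
// Points with W close to zero lie at infinity and are returned as NaN.
std::vector<Point3> fromHomogeneous(const std::vector<std::array<double, 4>>& points4D)
{
    std::vector<Point3> out;
    out.reserve(points4D.size());
    for (const auto& p : points4D)
    {
        const double w = p[3];
        if (std::fabs(w) < std::numeric_limits<double>::epsilon())
        {
            const double nan = std::numeric_limits<double>::quiet_NaN();
            out.push_back({nan, nan, nan});
        }
        else
        {
            out.push_back({p[0] / w, p[1] / w, p[2] / w});
        }
    }
    return out;
}
```

Each column of the 4xN result corresponds to one entry of `points4D` in this sketch.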
|
||||
|
||||
@sa
|
||||
reprojectImageTo3D
|
||||
*/
|
||||
@ -2232,15 +2523,16 @@ CV_EXPORTS_W void validateDisparity( InputOutputArray disparity, InputArray cost
|
||||
/** @brief Reprojects a disparity image to 3D space.
|
||||
|
||||
@param disparity Input single-channel 8-bit unsigned, 16-bit signed, 32-bit signed or 32-bit
|
||||
floating-point disparity image.
|
||||
The values of 8-bit / 16-bit signed formats are assumed to have no fractional bits.
|
||||
If the disparity is 16-bit signed format as computed by
|
||||
StereoBM/StereoSGBM/StereoBinaryBM/StereoBinarySGBM and may be other algorithms,
|
||||
it should be divided by 16 (and scaled to float) before being used here.
|
||||
@param _3dImage Output 3-channel floating-point image of the same size as disparity . Each
|
||||
element of _3dImage(x,y) contains 3D coordinates of the point (x,y) computed from the disparity
|
||||
map.
|
||||
@param Q \f$4 \times 4\f$ perspective transformation matrix that can be obtained with stereoRectify.
|
||||
floating-point disparity image. The values of 8-bit / 16-bit signed formats are assumed to have no
|
||||
fractional bits. If the disparity is 16-bit signed format, as computed by @ref StereoBM or
|
||||
@ref StereoSGBM and maybe other algorithms, it should be divided by 16 (and scaled to float) before
|
||||
being used here.
|
||||
@param _3dImage Output 3-channel floating-point image of the same size as disparity. Each element of
|
||||
_3dImage(x,y) contains 3D coordinates of the point (x,y) computed from the disparity map. If one
|
||||
uses Q obtained by @ref stereoRectify, then the returned points are represented in the first
|
||||
camera's rectified coordinate system.
|
||||
@param Q \f$4 \times 4\f$ perspective transformation matrix that can be obtained with
|
||||
@ref stereoRectify.
|
||||
@param handleMissingValues Indicates whether the function should handle missing values (i.e.
|
||||
points where the disparity was not computed). If handleMissingValues=true, then pixels with the
|
||||
minimal disparity that corresponds to the outliers (see StereoMatcher::compute ) are transformed
|
||||
@ -2252,11 +2544,20 @@ The function transforms a single-channel disparity map to a 3-channel image repr
|
||||
surface. That is, for each pixel (x,y) and the corresponding disparity d=disparity(x,y) , it
|
||||
computes:
|
||||
|
||||
\f[\begin{array}{l} [X \; Y \; Z \; W]^T = \texttt{Q} *[x \; y \; \texttt{disparity} (x,y) \; 1]^T \\ \texttt{\_3dImage} (x,y) = (X/W, \; Y/W, \; Z/W) \end{array}\f]
|
||||
\f[\begin{bmatrix}
|
||||
X \\
|
||||
Y \\
|
||||
Z \\
|
||||
W
|
||||
\end{bmatrix} = Q \begin{bmatrix}
|
||||
x \\
|
||||
y \\
|
||||
\texttt{disparity} (x,y) \\
|
||||
1
|
||||
\end{bmatrix}.\f]
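
For a single pixel, the computation can be sketched as below: the pixel and its disparity are extended with a homogeneous 1, multiplied by Q, and the result is divided by W. The helper names are hypothetical, and `disparityFromFixedPoint` only illustrates the division by 16 mentioned for 16-bit fixed-point disparities.

```cpp
#include <array>
#include <cassert>

using Mat4 = std::array<std::array<double, 4>, 4>;  // row-major 4x4 matrix

struct Point3 { double x, y, z; };

// Converts a 16-bit fixed-point disparity with 4 fractional bits to a
// floating-point disparity.
double disparityFromFixedPoint(short raw)
{
    return raw / 16.0;
}

// Computes [X Y Z W]^T = Q * [x y d 1]^T and returns (X/W, Y/W, Z/W).
Point3 reprojectPixel(const Mat4& Q, double x, double y, double disparity)
{
    const std::array<double, 4> in = {{x, y, disparity, 1.0}};
    std::array<double, 4> out{};
    for (int i = 0; i < 4; ++i)
        for (int k = 0; k < 4; ++k)
            out[i] += Q[i][k] * in[k];
    assert(out[3] != 0.0);  // W == 0 would correspond to a point at infinity
    return Point3{out[0] / out[3], out[1] / out[3], out[2] / out[3]};
}
```

Pixels with zero disparity lead to W = 0 for typical Q matrices, so a real application would need to handle them separately instead of relying on the assertion.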
|
||||
|
||||
The matrix Q can be an arbitrary \f$4 \times 4\f$ matrix (for example, the one computed by
|
||||
stereoRectify). To reproject a sparse set of points {(x,y,d),...} to 3D space, use
|
||||
perspectiveTransform .
|
||||
@sa
|
||||
To reproject a sparse set of points {(x,y,d),...} to 3D space, use perspectiveTransform.
|
||||
*/
|
||||
CV_EXPORTS_W void reprojectImageTo3D( InputArray disparity,
|
||||
OutputArray _3dImage, InputArray Q,
|
||||
@ -2463,11 +2764,19 @@ Check @ref tutorial_homography "the corresponding tutorial" for more details.
|
||||
@param translations Array of translation matrices.
|
||||
@param normals Array of plane normal matrices.
|
||||
|
||||
This function extracts relative camera motion between two views observing a planar object from the
|
||||
homography H induced by the plane. The intrinsic camera matrix K must also be provided. The function
|
||||
may return up to four mathematical solution sets. At least two of the solutions may further be
|
||||
invalidated if point correspondences are available by applying positive depth constraint (all points
|
||||
must be in front of the camera). The decomposition method is described in detail in @cite Malis .
|
||||
This function extracts relative camera motion between two views of a planar object and returns up to
|
||||
four mathematical solution tuples of rotation, translation, and plane normal. The decomposition of
|
||||
the homography matrix H is described in detail in @cite Malis.
|
||||
|
||||
If the homography H, induced by the plane, gives the constraint
|
||||
\f[s_i \vecthree{x'_i}{y'_i}{1} \sim H \vecthree{x_i}{y_i}{1}\f] on the source image points
|
||||
\f$p_i\f$ and the destination image points \f$p'_i\f$, then the tuple of rotations[k] and
|
||||
translations[k] is a change of basis from the source camera's coordinate system to the destination
|
||||
camera's coordinate system. However, by decomposing H, one can only get the translation normalized
|
||||
by the (typically unknown) depth of the scene, i.e. its direction but with normalized length.
|
||||
|
||||
If point correspondences are available, at least two solutions may further be invalidated by
|
||||
applying the positive depth constraint, i.e. all points must be in front of the camera.
|
||||
*/
|
||||
CV_EXPORTS_W int decomposeHomographyMat(InputArray H,
|
||||
InputArray K,
|
||||
|