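Following [4], we consider a stereo pair composed of two pinhole cameras, each modelled by its optical center $C$ and its retinal plane (or image plane) $\mathcal{R}$. In each camera, a point $W$ in 3-D space is projected into an image point $M$, which is the intersection of the line $CW$ with $\mathcal{R}$. The transformation from $W$ to $M$ is modelled by the linear transformation $\tilde{P}$ in projective (or homogeneous) coordinates:

$$
\begin{pmatrix} U \\ V \\ S \end{pmatrix} = \tilde{P} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix},
$$

where $(x, y, z)$ are the coordinates of $W$, and the pixel coordinates of $M$ are $u = U/S$ and $v = V/S$, provided $S \neq 0$.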
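As an illustration of the projection in homogeneous coordinates, the following minimal sketch applies a 3x4 projection matrix to a 3-D point and divides by the third homogeneous coordinate. The function name `project` and the canonical example matrix are assumptions made for this example only; they do not come from the paper.

```python
def project(P, w):
    """Project a 3-D point w = (x, y, z) with a 3x4 matrix P (list of 3 rows).

    Returns the pixel coordinates (u, v) = (U/S, V/S), or None when S = 0,
    i.e. when the point lies on the focal plane and is projected to infinity.
    """
    x, y, z = w
    wh = (x, y, z, 1.0)  # homogeneous coordinates of the 3-D point
    U, V, S = (sum(p * c for p, c in zip(row, wh)) for row in P)
    if S == 0:
        return None
    return (U / S, V / S)


# Hypothetical canonical camera (illustration only): optical center at the
# origin, retinal plane orthogonal to the z axis, unit focal length.
P = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]

print(project(P, (2.0, 4.0, 2.0)))  # (1.0, 2.0)
print(project(P, (1.0, 1.0, 0.0)))  # None: the point lies on the focal plane
```

The explicit test on $S$ keeps the special role of the focal plane visible; a production implementation would more likely compare $|S|$ against a small tolerance.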
The points for which $S = 0$ define the focal plane and are projected to infinity. The projection matrix $\tilde{P}$ can be decomposed into the product $\tilde{P} = A T$. $T$ maps from world to camera coordinates and depends on the extrinsic parameters of the stereo rig only; $A$, which maps from camera to pixel coordinates and depends on the intrinsic parameters only, has the following form:

$$
A = \begin{pmatrix} \alpha_u & 0 & u_0 \\ 0 & \alpha_v & v_0 \\ 0 & 0 & 1 \end{pmatrix},
$$

where $f$ is the focal length in millimeters, $k_u$ and $k_v$ are the scale factors along the $u$ and $v$ axes respectively (the number of pixels per millimeter), $\alpha_u = -f k_u$ and $\alpha_v = -f k_v$ are the focal lengths in horizontal and vertical pixels, respectively, and $(u_0, v_0)$ are the pixel coordinates of the principal point.
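The decomposition into intrinsic and extrinsic parts can be sketched as follows. This is a minimal example under stated assumptions: the sign convention $\alpha_u = -f k_u$, $\alpha_v = -f k_v$, the principal point $(u_0, v_0)$, the form $T = (R \mid t)$ with a rotation $R$ and a translation $t$, and all numerical values are chosen for illustration and are not taken from the paper.

```python
def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def intrinsic_matrix(f_mm, k_u, k_v, u0, v0):
    """3x3 matrix A mapping camera coordinates to pixel coordinates.

    Assumed convention: alpha_u = -f * k_u and alpha_v = -f * k_v, with
    (u0, v0) the principal point in pixels.
    """
    alpha_u = -f_mm * k_u
    alpha_v = -f_mm * k_v
    return [[alpha_u, 0.0, u0],
            [0.0, alpha_v, v0],
            [0.0, 0.0, 1.0]]


def extrinsic_matrix(R, t):
    """3x4 matrix T = (R | t) mapping world to camera coordinates (assumed form)."""
    return [list(R[i]) + [t[i]] for i in range(3)]


def project(P, w):
    """Apply the 3x4 matrix P to w = (x, y, z) and return (U/S, V/S)."""
    wh = (w[0], w[1], w[2], 1.0)
    U, V, S = (sum(p * c for p, c in zip(row, wh)) for row in P)
    return None if S == 0 else (U / S, V / S)


# Hypothetical camera: f = 8 mm, 100 pixels/mm on both axes,
# principal point at (320, 240), camera frame equal to the world frame.
A = intrinsic_matrix(8.0, 100.0, 100.0, 320.0, 240.0)
T = extrinsic_matrix([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
                     [0.0, 0.0, 0.0])
P = matmul(A, T)

print(project(P, (0.0, 0.0, 5.0)))   # (320.0, 240.0): the principal point
print(project(P, (1.0, 0.0, 10.0)))  # (240.0, 240.0)
```

A point on the optical axis projects to the principal point regardless of its depth, which gives a quick sanity check on the assembled matrix.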
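If we write the projection matrix as

$$
\tilde{P} = \begin{pmatrix} q_1^\top & q_{14} \\ q_2^\top & q_{24} \\ q_3^\top & q_{34} \end{pmatrix} = (P \mid \tilde{p}),
$$

with $w = (x, y, z)^\top$, we see that the plane $q_3^\top w + q_{34} = 0$ ($S = 0$) is the focal plane, and the two planes $q_1^\top w + q_{14} = 0$ and $q_2^\top w + q_{24} = 0$ intersect the retinal plane in the vertical ($U = 0$) and horizontal ($V = 0$) axes of the retinal coordinates, respectively.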
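The optical center, $c$, is the intersection of the three planes introduced in the previous paragraph; therefore, writing $\tilde{P} = (P \mid \tilde{p})$ with $P$ the left $3 \times 3$ block and $\tilde{p}$ the fourth column, $\tilde{P} \begin{pmatrix} c \\ 1 \end{pmatrix} = 0$, and $c = -P^{-1} \tilde{p}$. The optical ray associated to an image point $M$ is the line $MC$, i.e. the set of points $\{ w : \tilde{m} \simeq \tilde{P} \tilde{w} \}$, where $\tilde{m}$ and $\tilde{w}$ are the homogeneous coordinates of $M$ and $w$, and $\simeq$ denotes equality up to a scale factor. The equation of this ray can be written in parametric form as $w = c + \lambda P^{-1} \tilde{m}$, with $\lambda$ a real parameter.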
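The optical center and the optical ray can be computed directly from a 3x4 projection matrix split into its left 3x3 block and its fourth column. The sketch below does so in plain Python; the helper names (`inv3`, `optical_center`, `optical_ray`) and the example matrix are hypothetical, and the image point is taken as the homogeneous vector $(u, v, 1)$, which is an assumption of this example.

```python
def inv3(M):
    """Inverse of a 3x3 matrix via the adjugate (adequate for a small sketch)."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("the 3x3 block is singular")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]


def matvec(M, v):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def optical_center(Pt):
    """Optical center c = -P^{-1} p, with P the left 3x3 block of the
    3x4 matrix Pt and p its fourth column."""
    P = [row[:3] for row in Pt]
    p = [row[3] for row in Pt]
    return [-x for x in matvec(inv3(P), p)]


def optical_ray(Pt, m, lam):
    """Point w = c + lam * P^{-1} (u, v, 1) on the optical ray of pixel m = (u, v)."""
    u, v = m
    P_inv = inv3([row[:3] for row in Pt])
    c = optical_center(Pt)
    direction = matvec(P_inv, [u, v, 1.0])
    return [ci + lam * di for ci, di in zip(c, direction)]


# Hypothetical projection matrix, built as A (I | -c0) with c0 = (1, 2, 3)
# and the illustrative intrinsics alpha_u = alpha_v = -800, (u0, v0) = (320, 240).
Pt = [[-800.0, 0.0, 320.0, -160.0],
      [0.0, -800.0, 240.0, 880.0],
      [0.0, 0.0, 1.0, -3.0]]

print(optical_center(Pt))                   # close to [1, 2, 3], up to rounding
print(optical_ray(Pt, (160.0, 400.0), 5.0)) # close to [2, 1, 8], up to rounding
```

In this example the world point $(2, 1, 8)$ projects to the pixel $(160, 400)$ with $S = 5$, so the ray through that pixel reaches the point again at $\lambda = 5$. With $\tilde{m} = (u, v, 1)^\top$ the parameter $\lambda$ coincides with the third homogeneous coordinate $S$ of the projected point.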
Adrian F Clark