Computer Vision
Professor William Hoff

Dept of Electrical Engineering & Computer Science
Camera Calibration
• Needed for most machine vision and photogrammetry tasks
(object recognition, pose estimation, …).
• Calibration means estimating extrinsic (external) and intrinsic
(internal) parameters
• Extrinsic parameters
– position and orientation (pose) of camera
• Intrinsic parameters
– Focal length
– Pixel size
– Distortion coefficients
– Image center
• In this lecture we will concentrate on finding the intrinsic parameters
Example Use of Calibration
• Given a point in an image, what ray emanating from the
camera center corresponds to that pixel?
• Or, given a 3D point in the scene, what pixel does it project to?
• We know how to do this if the camera is a pinhole camera:
x = f X/Z, y = f Y/Z
By intersecting the ray with the ground plane, you
can estimate the position of the point
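The two questions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the lecture: the function names, the focal length, and the ground plane (camera 1.5 units above flat ground, camera y axis pointing down) are all hypothetical.

```python
def project_pinhole(point, f):
    """Project a 3D point (X, Y, Z) in camera coordinates: x = f X/Z, y = f Y/Z."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (f * X / Z, f * Y / Z)


def pixel_ray(x, y, f):
    """Direction of the ray from the camera center through image point (x, y)."""
    return (x, y, f)


def intersect_ray_with_plane(direction, normal, d):
    """Intersect the ray s * direction (s > 0) with the plane normal . P = d."""
    denom = sum(n * v for n, v in zip(normal, direction))
    if abs(denom) < 1e-12:
        raise ValueError("ray is parallel to the plane")
    s = d / denom
    if s <= 0:
        raise ValueError("plane is behind the camera")
    return tuple(s * v for v in direction)


# Hypothetical setup: the ground plane in camera coordinates is Y = 1.5.
f = 800.0
ground_point = (0.4, 1.5, 6.0)
x, y = project_pinhole(ground_point, f)
recovered = intersect_ray_with_plane(pixel_ray(x, y, f), (0.0, 1.0, 0.0), 1.5)
print("image point:", (x, y))
print("recovered 3D point:", recovered)
```

The ray only fixes the direction to the point. The extra constraint (here, the known ground plane) is what fixes the distance along the ray.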
• The use of a calibration pattern or set of markers is one of the more
reliable ways to estimate a camera’s intrinsic parameters
– A planar target is often used (it is easy to make), and you take multiple
images at different poses
• You can move the target in a controlled way (where the relative
poses of the target are known) or just move it in an uncontrolled
way (where you have to solve for the poses)
– It is best if the calibration object spans as much of the image as possible
• The strategy is to first solve for all calibration parameters except
lens distortion by assuming there is no distortion
– Then perform a final nonlinear optimization which includes solving for lens distortion
Example Calibration Patterns
• Geometry of target is known
• Pose of target is not known
Images from Sebastian Thrun
Simple Perspective Projection
$$\begin{bmatrix} u \\ v \\ w \end{bmatrix} = M \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
• A 3x4 camera projection matrix M
projects 3D points onto 2D image
• This matrix models:
rotation and translation
focal length
ratio of pixel height and width
image center
• It doesn’t model lens distortion
• (x, y) is the 2D image point, where x = u/w, y = v/w
• (X, Y, Z) is the 3D point in world coords
• M = K M_ext
– K is the 3x3 intrinsic parameter matrix
– M_ext is the 3x4 extrinsic parameter matrix, contains world to camera pose
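As a rough illustration of M = K M_ext, the sketch below builds both factors and projects one world point. The intrinsic values, the pose, and the helper names are hypothetical, and K is assumed to have zero skew.

```python
import math


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def project(M, point):
    """Apply the 3x4 matrix M to (X, Y, Z, 1) and divide by w."""
    X, Y, Z = point
    u, v, w = (sum(m * p for m, p in zip(row, (X, Y, Z, 1.0))) for row in M)
    return (u / w, v / w)


# Hypothetical intrinsics: focal lengths and image center, in pixels.
fx, fy, cx, cy = 800.0, 820.0, 320.0, 240.0
K = [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]

# Hypothetical world-to-camera pose: 30 degree rotation about y, then translation t.
a = math.radians(30.0)
R = [[math.cos(a), 0.0, math.sin(a)],
     [0.0, 1.0, 0.0],
     [-math.sin(a), 0.0, math.cos(a)]]
t = [0.1, -0.2, 5.0]
M_ext = [R[i] + [t[i]] for i in range(3)]  # 3x4 matrix [R | t]

M = matmul(K, M_ext)
origin_pixel = project(M, (0.0, 0.0, 0.0))
print("world origin projects to", origin_pixel)
```

The world origin maps to camera coordinates t, so its pixel depends only on K and t, which makes it a convenient sanity check.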
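Perspective projection of a plane
$$\begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$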
• If the target is planar, we can use Z=0 for all points on the target
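• So the equation simplifies to
$$\begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} & m_{14} \\ m_{21} & m_{22} & m_{24} \\ m_{31} & m_{32} & m_{34} \end{bmatrix} \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}$$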
• This 3x3 matrix is a homography that maps the planar target’s
calibration points (X, Y, 0) into image coordinates (x,y), where x=u/w,
y=v/w
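A small check of this simplification, as a sketch: dropping the third column of a projection matrix gives a 3x3 matrix that projects points with Z = 0 to the same pixel. The matrix entries and function names are made up for illustration.

```python
def project_3d(M, X, Y, Z):
    """Full 3x4 projection followed by division by w."""
    u, v, w = (r[0] * X + r[1] * Y + r[2] * Z + r[3] for r in M)
    return (u / w, v / w)


def homography_from_projection(M):
    """Drop the third column of M; it multiplies Z, so it has no effect when Z = 0."""
    return [[r[0], r[1], r[3]] for r in M]


def project_planar(H, X, Y):
    """3x3 homography applied to (X, Y, 1), followed by division by w."""
    u, v, w = (r[0] * X + r[1] * Y + r[2] for r in H)
    return (u / w, v / w)


# Hypothetical projection matrix, for illustration only.
M = [[800.0, 10.0, 320.0, 1600.0],
     [-5.0, 820.0, 240.0, 1200.0],
     [0.01, 0.02, 1.0, 5.0]]
H = homography_from_projection(M)

full = project_3d(M, 0.3, 0.2, 0.0)
planar = project_planar(H, 0.3, 0.2)
print(full, planar)
```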
Perspective projection of a plane (continued)
• We can solve for the 9 unknowns (m11..m34) by observing a set of known
points on a planar calibration target, then doing a least squares fit
– This is called the “direct linear transformation” (DLT)
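• To do this, write:
$$x = \frac{u}{w} = \frac{m_{11} X + m_{12} Y + m_{14}}{m_{31} X + m_{32} Y + m_{34}}, \qquad y = \frac{v}{w} = \frac{m_{21} X + m_{22} Y + m_{24}}{m_{31} X + m_{32} Y + m_{34}}$$
• Then
$$m_{11} X + m_{12} Y + m_{14} - x (m_{31} X + m_{32} Y + m_{34}) = 0$$
$$m_{21} X + m_{22} Y + m_{24} - y (m_{31} X + m_{32} Y + m_{34}) = 0$$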
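Written in matrix form, each observed point contributes two rows to a linear system. The layout below is one possible arrangement; the ordering of the unknowns (row by row) and the point index i are choices made here, not taken from the slides.

```latex
\begin{bmatrix}
X_i & Y_i & 1 & 0 & 0 & 0 & -x_i X_i & -x_i Y_i & -x_i \\
0 & 0 & 0 & X_i & Y_i & 1 & -y_i X_i & -y_i Y_i & -y_i
\end{bmatrix}
\begin{bmatrix}
m_{11} \\ m_{12} \\ m_{14} \\ m_{21} \\ m_{22} \\ m_{24} \\ m_{31} \\ m_{32} \\ m_{34}
\end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \end{bmatrix}
```

Stacking these rows for N points gives a 2N x 9 matrix. Since the solution is only defined up to scale, at least 4 points (8 equations) are needed.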
Direct Linear Transformation
• We collect all the unknowns (mij) into a 9x1 vector m, so Am = 0
• Am = 0 is a homogeneous system of equations
• You can only solve for m up to an unknown scale factor
• The solution is the eigenvector corresponding to the zero
eigenvalue of A^T A (with noisy measurements, the smallest eigenvalue)
• You can also find this using Singular Value Decomposition
• Recall that we can take the SVD of A; i.e., A = U D V^T
– And m is the column of V corresponding to the zero (or smallest) singular value of A
– (Since the singular values are sorted in decreasing order, this is the rightmost column of V)
• We can repeat for other poses
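The fit can be sketched in standard-library Python. Note the substitution: instead of taking the null vector of A by SVD as described above, this sketch fixes the unknown scale by setting m34 = 1 and solves the resulting 8-unknown least squares problem through the normal equations. That assumes m34 is not zero. With NumPy available, the last right singular vector from `numpy.linalg.svd` would follow the slides more directly. All function names and the test homography are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def estimate_homography(target_pts, image_pts):
    """Least squares fit of the planar projection, with m34 fixed to 1."""
    rows, rhs = [], []
    for (X, Y), (x, y) in zip(target_pts, image_pts):
        # m11 X + m12 Y + m14 - x (m31 X + m32 Y) = x
        rows.append([X, Y, 1.0, 0.0, 0.0, 0.0, -x * X, -x * Y])
        rhs.append(x)
        # m21 X + m22 Y + m24 - y (m31 X + m32 Y) = y
        rows.append([0.0, 0.0, 0.0, X, Y, 1.0, -y * X, -y * Y])
        rhs.append(y)
    n = 8
    # Normal equations: (A^T A) h = A^T b
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    Atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(n)]
    h = solve_linear(AtA, Atb) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def project_planar(H, X, Y):
    u, v, w = (r[0] * X + r[1] * Y + r[2] for r in H)
    return (u / w, v / w)


# Synthetic, noise-free data from a made-up homography.
H_true = [[2.0, 0.1, 5.0], [-0.2, 1.8, 3.0], [0.001, 0.002, 1.0]]
target_pts = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
image_pts = [project_planar(H_true, X, Y) for X, Y in target_pts]
H_est = estimate_homography(target_pts, image_pts)
for row in H_est:
    print([round(value, 6) for value in row])
```

With noise-free points the estimate should match the generating homography up to rounding. With real corner detections the residual is nonzero, and the result is normally used only as a starting point for the later nonlinear refinement.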
Extracting parameters
• We next extract the intrinsic and extrinsic camera parameters
from M, where M = K Mext
– The intrinsic parameters (in K) are fx, fy, cx, cy
– The extrinsic parameters (in Mext) are r11, r12, …, r33, tx, ty, tz
• The procedure requires some linear algebra but is fairly
straightforward. It is described clearly in:
Zhang, Z. (2000). A flexible new technique for camera calibration. IEEE Trans.
on Pattern Analysis & Machine Intelligence, 22(11):1330–1334.
• The final result is that we have solved for all intrinsic and
extrinsic parameters except lens distortion
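One step of that procedure can be sketched: once K is known, the pose for a view follows from its homography, since for a planar target H is proportional to K [r1 r2 t]. The code below assumes a zero-skew K built from fx, fy, cx, cy and a target in front of the camera (tz > 0). Estimating K itself from several homographies is not shown here. Function names and numeric values are hypothetical.

```python
import math


def k_inverse_times(K, v):
    """Apply the inverse of K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]] to a 3-vector."""
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]
    return [(v[0] - cx * v[2]) / fx, (v[1] - cy * v[2]) / fy, v[2]]


def pose_from_homography(H, K):
    """Recover R and t from H ~ K [r1 r2 t], where H has arbitrary scale."""
    cols = [[H[r][c] for r in range(3)] for c in range(3)]
    a, b, c = (k_inverse_times(K, col) for col in cols)
    scale = 1.0 / math.sqrt(sum(x * x for x in a))  # makes r1 a unit vector
    if c[2] < 0:
        scale = -scale  # choose the sign that puts the target in front of the camera
    r1 = [scale * x for x in a]
    r2 = [scale * x for x in b]
    r3 = [r1[1] * r2[2] - r1[2] * r2[1],
          r1[2] * r2[0] - r1[0] * r2[2],
          r1[0] * r2[1] - r1[1] * r2[0]]  # r3 = r1 x r2
    t = [scale * x for x in c]
    R = [[r1[i], r2[i], r3[i]] for i in range(3)]
    return R, t


def planar_homography(K, R, t, scale=1.0):
    """Build scale * K [r1 r2 t], used here only to generate test data."""
    cols = [[R[i][0] for i in range(3)], [R[i][1] for i in range(3)], list(t)]
    H = [[0.0] * 3 for _ in range(3)]
    for c, col in enumerate(cols):
        for r in range(3):
            H[r][c] = scale * sum(K[r][j] * col[j] for j in range(3))
    return H


K = [[800.0, 0.0, 320.0], [0.0, 820.0, 240.0], [0.0, 0.0, 1.0]]
angle = math.radians(30.0)
R_true = [[math.cos(angle), 0.0, math.sin(angle)],
          [0.0, 1.0, 0.0],
          [-math.sin(angle), 0.0, math.cos(angle)]]
t_true = [0.1, -0.2, 5.0]
H = planar_homography(K, R_true, t_true, scale=2.5)
R, t = pose_from_homography(H, K)
print("t =", [round(value, 6) for value in t])
```

With noisy data the recovered R is generally not exactly orthonormal. Zhang's paper describes replacing it with the closest rotation matrix.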
Lens Distortion
• Lens distortion - projected points do not follow the
simple pinhole camera formula
• Most common are barrel and pincushion distortion
– Points are displaced radially inward (barrel) or outward
(pincushion) from correct position
– Tangential distortions are also possible
• There are other types of lens aberrations that we
won’t consider
– Spherical, coma, astigmatism (these blur the point)
– Chromatic aberrations (color affects focal length)
Radial Distortion Examples
wide-angle (barrel)
telephoto (pincushion)
Image from Sebastian Thrun
Camera Model
• We use the model of Heikkilä and Silvén at the
University of Oulu in Finland
• Also used in the Matlab camera calibration toolbox by
Strobl, et al.
• Usual convention of x increasing left to right, y
increasing top to bottom
– the top left pixel is the origin
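Colorado School of Mines
• Model of radial distortion
$$\begin{bmatrix} x_d \\ y_d \end{bmatrix} = (1 + k_1 r^2 + k_2 r^4 + k_5 r^6) \begin{bmatrix} x \\ y \end{bmatrix}$$
where r = distance from center, r^2 = x^2 + y^2
• Model of tangential distortion
$$dx = \begin{bmatrix} 2 k_3 x y + k_4 (r^2 + 2 x^2) \\ k_3 (r^2 + 2 y^2) + 2 k_4 x y \end{bmatrix}$$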
Complete Projection Model
• A 3D point is projected onto the image
plane (x,y) using the pinhole model
• Then the point is distorted using
$$\begin{bmatrix} x_d \\ y_d \end{bmatrix} = (1 + k_1 r^2 + k_2 r^4 + k_5 r^6) \begin{bmatrix} x \\ y \end{bmatrix} + dx$$
• Then compute pixel location
$$\begin{bmatrix} x_p \\ y_p \end{bmatrix} = \begin{bmatrix} f_1 x_d + cc_1 \\ f_2 y_d + cc_2 \end{bmatrix}$$
– Note: f1 and f2 are focal lengths expressed in units of horizontal and
vertical pixels. Both components are usually very similar.
– The ratio f2/f1, often called "aspect ratio", is different from 1 if the
pixels in the CCD array are not square
• There are a total of 9 parameters to be estimated
– f1, f2, cc1, cc2, k1..k5
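The three steps can be strung together as in the sketch below. It assumes the pinhole step produces normalized coordinates x = X/Z, y = Y/Z (unit focal length), with f1 and f2 applied only in the final pixel step. The function names and parameter values are hypothetical.

```python
def distort(x, y, k):
    """Apply radial (k1, k2, k5) and tangential (k3, k4) distortion to (x, y)."""
    k1, k2, k3, k4, k5 = k
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k5 * r2 ** 3
    dx = 2.0 * k3 * x * y + k4 * (r2 + 2.0 * x * x)
    dy = k3 * (r2 + 2.0 * y * y) + 2.0 * k4 * x * y
    return (radial * x + dx, radial * y + dy)


def project_point(P, f, cc, k):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    X, Y, Z = P
    x, y = X / Z, Y / Z                # pinhole model, normalized coordinates
    xd, yd = distort(x, y, k)          # lens distortion
    return (f[0] * xd + cc[0], f[1] * yd + cc[1])  # pixel location


# Hypothetical parameter values, for illustration only.
f = (800.0, 820.0)
cc = (320.0, 240.0)
P = (0.5, -0.25, 2.0)

ideal = project_point(P, f, cc, (0.0, 0.0, 0.0, 0.0, 0.0))
barrel = project_point(P, f, cc, (-0.2, 0.0, 0.0, 0.0, 0.0))
print("no distortion:", ideal)
print("k1 = -0.2:   ", barrel)
```

With all k set to zero the model reduces to the plain pinhole camera. A negative k1 pulls the point toward the image center, which is the barrel case.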
Solving for Parameters
• Typically an initial guess for pose and focal length is found, ignoring
lens distortion
• Then non-linear optimization (such as iterative least squares) is used
to find all parameters … this minimizes image residual errors
from Heikkilä and Silvén
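The quantity being minimized can be written as a sum of squared image residuals. The notation below is an assumption made for this sketch: p_ij is the measured pixel location of target point j in image i, P_j is the known point on the target, (R_i, t_i) is the pose for image i, and the function \hat{p} stands for the complete projection model (pinhole, distortion, pixel conversion).

```latex
\min_{\theta,\; \{R_i, t_i\}} \;
\sum_{i=1}^{K} \sum_{j=1}^{M}
\left\| \, p_{ij} - \hat{p}(\theta, R_i, t_i, P_j) \, \right\|^2,
\qquad
\theta = (f_1, f_2, cc_1, cc_2, k_1, \ldots, k_5)
```

Here K is the number of images and M the number of points per image, not the intrinsic matrix.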
How Many Images are Needed?
• Unknowns:
– There are 9 intrinsic parameters
– 6 additional parameters for pose of each viewpoint
• Given:
– K images with M points each
– Each point yields two equations
• So 9+6K unknowns and 2KM equations
• Need 2KM > 9+6K
– Example: 3 viewpoints => 27 unknowns
– Need at least 5 corners per image
– Of course, more are better
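The counting argument is easy to tabulate. A small sketch follows; the function name is hypothetical, and the defaults are the 9 intrinsic and 6 pose parameters from this slide.

```python
def min_points_per_image(num_images, num_intrinsics=9, pose_params=6):
    """Return (unknowns, smallest M such that 2*K*M >= unknowns)."""
    unknowns = num_intrinsics + pose_params * num_images
    equations_per_point = 2 * num_images
    # Ceiling division using integers only.
    min_points = -(-unknowns // equations_per_point)
    return unknowns, min_points


for k_images in (1, 2, 3, 5, 10):
    unknowns, points = min_points_per_image(k_images)
    print(f"{k_images} images: {unknowns} unknowns, at least {points} points per image")
```

Because 9 + 6K is always odd and 2KM is always even, the two sides can never be equal, so ">" and ">=" give the same minimum. The count is only a necessary condition: the points and poses also have to be in a non-degenerate configuration.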
Matlab Camera Calibration Toolbox
• See
• Requires manual initial picking of corners in images
• Software finds pose of target in each image, and the intrinsic parameters
• Can rectify entire image
– This will undo lens distortion
– Creates an image as if it were taken with a pinhole camera
• Or, use the normalize function from the toolbox to
undistort a single point
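Undistorting a single point means inverting the distortion model, which has no closed-form solution. One common approach is fixed-point iteration, sketched below in Python. This is not the toolbox's code: the function names, the iteration count, and the parameter values are assumptions made for illustration.

```python
def distort(x, y, k):
    """Forward model: radial (k1, k2, k5) plus tangential (k3, k4) distortion."""
    k1, k2, k3, k4, k5 = k
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k5 * r2 ** 3
    dx = 2.0 * k3 * x * y + k4 * (r2 + 2.0 * x * x)
    dy = k3 * (r2 + 2.0 * y * y) + 2.0 * k4 * x * y
    return (radial * x + dx, radial * y + dy)


def undistort_pixel(xp, yp, f, cc, k, iterations=20):
    """Map a pixel to undistorted normalized coordinates (x, y)."""
    k1, k2, k3, k4, k5 = k
    # Undo the pixel conversion: x_d = (x_p - cc1) / f1, y_d = (y_p - cc2) / f2.
    xd = (xp - cc[0]) / f[0]
    yd = (yp - cc[1]) / f[1]
    x, y = xd, yd  # initial guess: no distortion
    for _ in range(iterations):
        r2 = x * x + y * y
        radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k5 * r2 ** 3
        dx = 2.0 * k3 * x * y + k4 * (r2 + 2.0 * x * x)
        dy = k3 * (r2 + 2.0 * y * y) + 2.0 * k4 * x * y
        x = (xd - dx) / radial
        y = (yd - dy) / radial
    return (x, y)


# Round trip with hypothetical parameters: distort a point, then recover it.
f = (800.0, 820.0)
cc = (320.0, 240.0)
k = (-0.2, 0.05, 0.001, -0.002, 0.0)
x_true, y_true = 0.25, -0.125
xd, yd = distort(x_true, y_true, k)
pixel = (f[0] * xd + cc[0], f[1] * yd + cc[1])
x_rec, y_rec = undistort_pixel(pixel[0], pixel[1], f, cc, k)
print("recovered:", (x_rec, y_rec))
```

The iteration converges quickly for moderate distortion. For strong distortion far from the image center it can be slow or fail, so a convergence check would be a sensible addition.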
Minor fix to code
• If you are having memory problems, try this
• In file “apply_distortion.m”, comment out these lines
% aa = (2*k(3)*x(2,:)+6*k(4)*x(1,:))'*ones(1,3);
% bb = (2*k(3)*x(1,:)+2*k(4)*x(2,:))'*ones(1,3);
% cc = (6*k(3)*x(2,:)+2*k(4)*x(1,:))'*ones(1,3);
• Code doesn’t use these variables
• If you don’t comment them out, it may run out of memory on
large images
