Transcript Document

3D Geometry and
Camera Calibration
3D Coordinate Systems
• Right-handed vs. left-handed
[Figure: two sets of x, y, z axes, one right-handed and one left-handed]
2D Coordinate Systems
• y axis up vs. y axis down
• Origin at center vs. corner
• Will often write (u, v) for image coordinates
[Figure: three (u, v) axis layouts, differing in axis direction and origin placement]
3D Geometry Basics
• 3D points = column vectors
\[
\mathbf{p} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}
\]
• Transformations = pre-multiplied matrices
\[
T\mathbf{p} = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
\]
Rotation
• Rotation about the z axis
\[
R_z = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]
• Rotation about x, y axes similar
(cyclically permute x, y, z)
Arbitrary Rotation
• Any rotation is a composition of rotations about
x, y, and z
• Composition of transformations =
matrix multiplication (watch the order!)
• Result: orthonormal matrix
– Each row, column has unit length
– Dot product of any two distinct rows (or columns) = 0
– Inverse of matrix = transpose
Arbitrary Rotation
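• Rotate around x, y, then z:
\[
R = R_z R_y R_x =
\begin{pmatrix}
\cos\theta_y \cos\theta_z & -\cos\theta_x \sin\theta_z + \sin\theta_x \sin\theta_y \cos\theta_z & \sin\theta_x \sin\theta_z + \cos\theta_x \sin\theta_y \cos\theta_z \\
\cos\theta_y \sin\theta_z & \cos\theta_x \cos\theta_z + \sin\theta_x \sin\theta_y \sin\theta_z & -\sin\theta_x \cos\theta_z + \cos\theta_x \sin\theta_y \sin\theta_z \\
-\sin\theta_y & \sin\theta_x \cos\theta_y & \cos\theta_x \cos\theta_y
\end{pmatrix}
\]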
• Don’t do this! Compute simple matrices and
multiply them!
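A minimal NumPy sketch of this advice, assuming angles in radians and the column-vector convention from earlier. The helper names rot_x, rot_y, rot_z and the example angles are illustrative, not from the slides.

```python
import numpy as np

def rot_x(t):
    """Rotation by angle t (radians) about the x axis."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(t):
    """Rotation by angle t (radians) about the y axis."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(t):
    """Rotation by angle t (radians) about the z axis."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

tx, ty, tz = 0.3, -0.5, 1.1  # arbitrary example angles

# Rotate about x first, then y, then z: the first rotation is the rightmost factor.
R = rot_z(tz) @ rot_y(ty) @ rot_x(tx)

# Orthonormality: the inverse equals the transpose.
is_orthonormal = np.allclose(R.T @ R, np.eye(3))

# Order matters: reversing the product gives a different rotation.
R_reversed = rot_x(tx) @ rot_y(ty) @ rot_z(tz)
order_matters = not np.allclose(R, R_reversed)
```

Multiplying the three simple matrices reproduces the closed-form entries, for example R[2, 0] equals -sin(ty), without anyone having to type the long expression.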
Scale
• Scale in x, y, z:
\[
S = \begin{pmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & s_z \end{pmatrix}
\]
Shear
• Shear parallel to xy plane:
\[
\sigma_{xy} = \begin{pmatrix} 1 & 0 & \sigma_x \\ 0 & 1 & \sigma_y \\ 0 & 0 & 1 \end{pmatrix}
\]
Translation
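• Can translation be represented by multiplying
by a 3×3 matrix?
• No.
• Proof: every matrix maps the origin to itself, but a nonzero translation moves the origin
\[
\forall A:\; A\mathbf{0} = \mathbf{0}
\]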
Homogeneous Coordinates
• Add a fourth dimension to each point:
\[
\begin{pmatrix} x \\ y \\ z \end{pmatrix} \to \begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix}
\]
• To get “real” (3D) coordinates, divide by w:
\[
\begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix} \to \begin{pmatrix} x/w \\ y/w \\ z/w \end{pmatrix}
\]
Translation in Homogeneous Coordinates
\[
\begin{pmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix}
=
\begin{pmatrix} x + t_x w \\ y + t_y w \\ z + t_z w \\ w \end{pmatrix}
\]
• After divide by w, this is just a translation
by (tx , ty , tz)
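A small NumPy sketch of translation in homogeneous coordinates, using a point with w = 2 to show that the divide by w still yields a plain translation. The function names and numbers are illustrative assumptions.

```python
import numpy as np

def translation(tx, ty, tz):
    """4x4 homogeneous translation matrix."""
    T = np.eye(4)
    T[:3, 3] = [tx, ty, tz]
    return T

def from_homogeneous(p):
    """Divide by w to recover ordinary 3D coordinates."""
    return p[:3] / p[3]

p = np.array([2.0, 4.0, 6.0, 2.0])          # represents the 3D point (1, 2, 3)
moved = translation(10.0, 20.0, 30.0) @ p   # (x + tx*w, y + ty*w, z + tz*w, w)
result = from_homogeneous(moved)            # (1 + 10, 2 + 20, 3 + 30)
```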
Perspective Projection
• What does 4th row of matrix do?
\[
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix}
=
\begin{pmatrix} x \\ y \\ z \\ z \end{pmatrix}
\]
• After divide,
\[
\begin{pmatrix} x \\ y \\ z \\ z \end{pmatrix} \to \begin{pmatrix} x/z \\ y/z \\ 1 \end{pmatrix}
\]
Perspective Projection
• This is projection onto the z=1 plane
[Figure: the point (x,y,z) projects along the line through the origin (0,0,0) to (x/z,y/z,1) on the plane z=1]
• Add scaling, etc. ⇒ pinhole camera model
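A minimal sketch of this projection in NumPy. The function name project is illustrative, and the sketch assumes z is nonzero (a point with z = 0 has no image on the z=1 plane).

```python
import numpy as np

# The fourth row copies z into w, so the homogeneous divide becomes a divide by z.
P = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 1, 0]], dtype=float)

def project(point3d):
    """Project a 3D point onto the z=1 plane (assumes z != 0)."""
    h = P @ np.append(point3d, 1.0)
    return h[:3] / h[3]

near = project(np.array([1.0, 2.0, 4.0]))
far = project(np.array([2.0, 4.0, 8.0]))   # on the same ray through the origin
```

Both points lie on the same ray through the origin, so they land on the same spot, (0.25, 0.5, 1), which is the defining property of perspective projection.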
Putting It All Together: A Camera Model
\[
T_{img}\, S_{pix}\, P_{cam}\, R_{cam}\, T_{cam}\, \mathbf{x}
\]
• $T_{img}$: translate to image center
• $S_{pix}$: scale to pixel size
• $P_{cam}$: perspective projection
• $R_{cam}$: camera orientation
• $T_{cam}$: camera location
• $\mathbf{x}$: 3D point (homogeneous coords)
• Then perform homogeneous divide, and get (u,v) coords
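The chain can be sketched with 4×4 matrices as follows. All numeric values are assumptions chosen for illustration: a camera 5 units behind the world origin, axes aligned with the world, 500 pixels per unit on the z=1 plane, and image center (320, 240). The sketch also assumes that the camera-location matrix translates by the negated camera position, so that the camera ends up at the origin.

```python
import numpy as np

def translate(t):
    """4x4 homogeneous translation by the 3-vector t."""
    T = np.eye(4)
    T[:3, 3] = t
    return T

cam_location = np.array([0.0, 0.0, -5.0])      # assumed camera position in the world
pixels_per_unit = 500.0                        # assumed scale to pixel size
image_center = np.array([320.0, 240.0, 0.0])   # assumed image center (cu, cv)

T_cam = translate(-cam_location)               # move the camera to the origin
R_cam = np.eye(4)                              # assumed: camera axes aligned with world axes
P_cam = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0],
                  [0, 0, 1, 0],
                  [0, 0, 1, 0]], dtype=float)  # perspective projection onto z=1
S_pix = np.diag([pixels_per_unit, pixels_per_unit, 1.0, 1.0])
T_img = translate(image_center)

M = T_img @ S_pix @ P_cam @ R_cam @ T_cam      # rightmost factor is applied first

def to_pixels(point3d):
    """Apply the full chain, then the homogeneous divide, and return (u, v)."""
    h = M @ np.append(point3d, 1.0)
    return h[:2] / h[3]

uv = to_pixels(np.array([1.0, -0.5, 5.0]))
```

The image-center translation is applied before the divide, which works because a homogeneous translation adds t times w, and w equals the depth at that stage.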
Putting It All Together:
A Camera Model
• Intrinsics: $T_{img}\, S_{pix}\, P_{cam}$
• Extrinsics: $R_{cam}\, T_{cam}$
\[
T_{img}\, S_{pix}\, P_{cam}\, R_{cam}\, T_{cam}\, \mathbf{x}
\]
Putting It All Together:
A Camera Model
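[Figure: the same product annotated with the coordinate system at each stage, from world coordinates (the input x) through camera coordinates, eye coordinates, normalized device coordinates and image coordinates to pixel coordinates (the output)]
\[
T_{img}\, S_{pix}\, P_{cam}\, R_{cam}\, T_{cam}\, \mathbf{x}
\]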
More General Camera Model
• Multiply all these matrices together
• Don’t care about “z” after transformation
\[
\begin{pmatrix} a & b & c & d \\ e & f & g & h \\ i & j & k & l \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
\;\xrightarrow{\text{homogeneous divide}}\;
\begin{pmatrix} \dfrac{ax + by + cz + d}{ix + jy + kz + l} \\[2ex] \dfrac{ex + fy + gz + h}{ix + jy + kz + l} \end{pmatrix}
\]
• Scale ambiguity ⇒ 11 free parameters (12 matrix entries, minus one overall scale)
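A short NumPy sketch of the scale ambiguity, with arbitrary example values standing in for the entries a through l: multiplying the whole matrix by a constant leaves the projected (u, v) unchanged, because the constant cancels in the divide.

```python
import numpy as np

# Arbitrary example values for the 3x4 camera matrix entries a..l.
M = np.array([[500.0, 0.0, 320.0, 100.0],
              [0.0, 500.0, 240.0, -50.0],
              [0.0, 0.0, 1.0, 2.0]])

def project(M, point3d):
    """Apply a 3x4 camera matrix and the homogeneous divide."""
    num_u, num_v, denom = M @ np.append(point3d, 1.0)
    return np.array([num_u / denom, num_v / denom])

X = np.array([0.2, -0.1, 3.0])
uv = project(M, X)
uv_scaled = project(7.0 * M, X)   # the overall scale cancels in the divide
```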
Radial Distortion
• Radial distortion cannot be represented
by a matrix
\[
u_{img} \to c_u + u^*_{img}\left(1 + k\,(u^{*2}_{img} + v^{*2}_{img})\right)
\]
\[
v_{img} \to c_v + v^*_{img}\left(1 + k\,(u^{*2}_{img} + v^{*2}_{img})\right)
\]
• $(c_u, c_v)$ is image center,
$u^*_{img} = u_{img} - c_u$, $v^*_{img} = v_{img} - c_v$,
k is first-order radial distortion coefficient
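The first-order model translates directly into code. In this sketch the function name and the numeric values of cu, cv and k are assumptions for illustration only.

```python
import numpy as np

def apply_radial_distortion(u_img, v_img, cu, cv, k):
    """First-order radial distortion about the image center (cu, cv)."""
    u_star = u_img - cu
    v_star = v_img - cv
    factor = 1.0 + k * (u_star**2 + v_star**2)
    return cu + u_star * factor, cv + v_star * factor

cu, cv, k = 320.0, 240.0, 1e-7   # assumed values for illustration

center = apply_radial_distortion(320.0, 240.0, cu, cv, k)   # the center does not move
corner = apply_radial_distortion(620.0, 640.0, cu, cv, k)   # squared radius 250000, factor 1.025
```

The displacement grows with the squared distance from the image center, which is why no single matrix can reproduce it.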
Camera Calibration
• Determining values for camera parameters
• Necessary for any algorithm that requires
3D ↔ 2D mapping
• Method used depends on:
– What data is available
– Intrinsics only vs. extrinsics only vs. both
– Form of camera model
Camera Calibration – Example 1
• Given:
– 3D ↔ 2D correspondences
– General perspective camera model
(11-parameter, no radial distortion)
• Write equations:
\[
\frac{a x_1 + b y_1 + c z_1 + d}{i x_1 + j y_1 + k z_1 + l} = u_1
\]
\[
\frac{e x_1 + f y_1 + g z_1 + h}{i x_1 + j y_1 + k z_1 + l} = v_1
\]
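Camera Calibration – Example 1
\[
\begin{pmatrix}
x_1 & y_1 & z_1 & 1 & 0 & 0 & 0 & 0 & -u_1 x_1 & -u_1 y_1 & -u_1 z_1 & -u_1 \\
0 & 0 & 0 & 0 & x_1 & y_1 & z_1 & 1 & -v_1 x_1 & -v_1 y_1 & -v_1 z_1 & -v_1 \\
x_2 & y_2 & z_2 & 1 & 0 & 0 & 0 & 0 & -u_2 x_2 & -u_2 y_2 & -u_2 z_2 & -u_2 \\
0 & 0 & 0 & 0 & x_2 & y_2 & z_2 & 1 & -v_2 x_2 & -v_2 y_2 & -v_2 z_2 & -v_2 \\
\vdots & & & & & & & & & & & \vdots
\end{pmatrix}
\begin{pmatrix} a \\ b \\ c \\ \vdots \\ l \end{pmatrix}
= \mathbf{0}
\]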
• Linear equation
• Overconstrained (more equations than unknowns)
• Underconstrained (rank deficient matrix – any multiple
of a solution, including 0, is
also a solution)
Camera Calibration – Example 1
• Standard linear least squares methods for
Ax=0 will give the solution x=0
• Instead, look for a solution with |x|= 1
• That is, minimize $|Ax|^2$ subject to $|x|^2 = 1$
Camera Calibration – Example 1
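• Minimize $|Ax|^2$ subject to $|x|^2 = 1$
• $|Ax|^2 = (Ax)^T (Ax) = (x^T A^T)(Ax) = x^T (A^T A)\, x$
• Expand x in terms of eigenvectors of $A^T A$:
$x = \mu_1 e_1 + \mu_2 e_2 + \dots$
$x^T (A^T A)\, x = \lambda_1 \mu_1^2 + \lambda_2 \mu_2^2 + \dots$
$|x|^2 = \mu_1^2 + \mu_2^2 + \dots$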
Camera Calibration – Example 1
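• To minimize
$\lambda_1 \mu_1^2 + \lambda_2 \mu_2^2 + \dots$
subject to
$\mu_1^2 + \mu_2^2 + \dots = 1$
set $\mu_{min} = 1$ and all other $\mu_i = 0$
• Thus, least squares solution is the eigenvector of $A^T A$ corresponding to the minimum (nonzero) eigenvalue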
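A sketch of this calibration in NumPy, checked on synthetic data. One substitution is made: the code takes the right singular vector of A for the smallest singular value instead of forming A^T A explicitly. This is the same vector as the minimum-eigenvalue eigenvector of A^T A, and it is usually better behaved numerically. The function name, the ground-truth matrix and the random points are all assumptions for illustration. At least 6 correspondences are needed, since each contributes 2 equations toward 11 unknowns.

```python
import numpy as np

def calibrate_dlt(points3d, points2d):
    """Estimate a 3x4 camera matrix (up to scale) from 3D-2D correspondences."""
    rows = []
    for (x, y, z), (u, v) in zip(points3d, points2d):
        rows.append([x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z, -u])
        rows.append([0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z, -v])
    A = np.array(rows, dtype=float)
    # Singular values come back in descending order, so the last row of Vt
    # minimizes |Ax| subject to |x| = 1.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)

# Synthetic check against an assumed ground-truth camera.
M_true = np.array([[1.0, 0.0, 0.2, 0.1],
                   [0.0, 1.0, 0.1, -0.2],
                   [0.0, 0.0, 1.0, 2.0]])
rng = np.random.default_rng(0)
pts3d = rng.uniform(-1.0, 1.0, size=(10, 3)) + np.array([0.0, 0.0, 4.0])
h = np.hstack([pts3d, np.ones((10, 1))]) @ M_true.T
pts2d = h[:, :2] / h[:, 2:3]

M_est = calibrate_dlt(pts3d, pts2d)
# The estimate has unit norm; rescale it so it can be compared entry by entry.
M_rescaled = M_est * (M_true[2, 3] / M_est[2, 3])
```

With noisy measurements the recovered matrix would only approximate the true one, which is where the nonlinear refinements in the later examples come in.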
Camera Calibration – Example 2
• Incorporating additional constraints into camera
model
– No shear, no scale (rigid-body motion)
– Square pixels
– etc.
• These impose nonlinear constraints on camera
parameters
Camera Calibration – Example 2
• Option 1: solve for general perspective model,
then find closest solution that
satisfies constraints
• Option 2: nonlinear least squares
– Usually “gradient descent” techniques
– Common implementations available
(e.g. Matlab optimization toolbox)
Camera Calibration – Example 3
• Incorporating radial distortion
• Option 1:
– Find distortion first (straight lines in
calibration target)
– Warp image to eliminate distortion
– Run (simpler) perspective calibration
• Option 2: nonlinear least squares
Camera Calibration – Example 4
• What if 3D points are not known?
• Structure from motion problem!
• As we saw last time, can often be solved since #
of knowns > # of unknowns
Multi-Camera Geometry
• Epipolar geometry – relationship between
observed positions of points in multiple cameras
• Assume:
– 2 cameras
– Known intrinsics and extrinsics
Epipolar Geometry
[Figure: a 3D point P, its images p1 and p2, and the camera centers C1 and C2; the epipolar line l2 and the epipoles are labeled]
Epipolar Geometry
• Goal: derive equation for l2
• Observation: P, C1, C2 determine a plane
Epipolar Geometry
• Work in coordinate frame of C1
• Normal of plane is $T \times R\,p_2$, where T is relative
translation, R is relative rotation
Epipolar Geometry
• p1 is perpendicular to this normal:
\[
p_1 \cdot (T \times R\,p_2) = 0
\]
Epipolar Geometry
• Write cross product as matrix multiplication
\[
T \times x = T^* x, \qquad
T^* = \begin{pmatrix} 0 & -T_z & T_y \\ T_z & 0 & -T_x \\ -T_y & T_x & 0 \end{pmatrix}
\]
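A quick NumPy check of this identity. The helper name cross_matrix and the test vectors are illustrative.

```python
import numpy as np

def cross_matrix(t):
    """Skew-symmetric matrix T* such that T* @ x equals np.cross(t, x)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])

T = np.array([1.0, 2.0, 3.0])
x = np.array([-4.0, 0.5, 2.0])

via_matrix = cross_matrix(T) @ x
via_cross = np.cross(T, x)
```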
Epipolar Geometry
• $p_1 \cdot T^* R\, p_2 = 0$
⇒ $p_1^T E\, p_2 = 0$, with $E = T^* R$
• E is the essential matrix
Essential Matrix
• E depends only on camera geometry
• Given E, can derive equation for line l2
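A sketch of both bullets on synthetic data. It assumes the convention that a point with coordinates X2 in camera 2's frame has coordinates R X2 + T in camera 1's frame, which matches working in the frame of C1. The pose and the points are made-up values. The line l2 is written as coefficients (a, b, c) of a x + b y + c = 0 on camera 2's z=1 plane.

```python
import numpy as np

def cross_matrix(t):
    """Skew-symmetric matrix T* such that T* @ x equals np.cross(t, x)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])

# Assumed relative pose: X1 = R @ X2 + T.
angle = 0.2
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
T = np.array([1.0, 0.1, 0.0])
E = cross_matrix(T) @ R

X2 = np.array([0.3, -0.2, 5.0])   # a 3D point expressed in camera 2's frame
X1 = R @ X2 + T                   # the same point in camera 1's frame
p1 = X1 / X1[2]                   # image of P in camera 1 (z=1 plane)
p2 = X2 / X2[2]                   # image of P in camera 2 (z=1 plane)

residual = p1 @ E @ p2            # epipolar constraint, zero up to round-off

# Since p1^T E p2 = (E^T p1) . p2, the line l2 has coefficients E^T p1.
l2 = E.T @ p1

# A different 3D point on the same ray through C1 and p1 also projects onto l2.
X1_other = 2.0 * X1
X2_other = R.T @ (X1_other - T)
p2_other = X2_other / X2_other[2]
on_line = l2 @ p2_other
```

E is built from R and T alone, so the same matrix serves every point seen by this camera pair.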
Fundamental Matrix
• Can define fundamental matrix F analogously,
operating on pixel coordinates instead of
camera coordinates
\[
u_1^T F\, u_2 = 0
\]
• Advantage: can sometimes estimate F without
knowing camera calibration
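One way to see how F relates to E is to assume, which the slides do not state, that pixel coordinates come from camera coordinates through invertible 3×3 intrinsic matrices $K_1$ and $K_2$, so that $u_i = K_i\, p_i$. Substituting into the essential-matrix constraint gives:

```latex
p_1^T E\, p_2
  = (K_1^{-1} u_1)^T E\, (K_2^{-1} u_2)
  = u_1^T \left( K_1^{-T} E\, K_2^{-1} \right) u_2 = 0
\quad\Longrightarrow\quad
F = K_1^{-T} E\, K_2^{-1}
```

Under that assumption F absorbs the intrinsics, which is consistent with estimating it directly from pixel correspondences.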