Announcements

advertisement
Announcements
Structure-from-Motion
• Determining the 3-D structure of the world, and/or the
motion of a camera using a sequence of images
taken by a moving camera.
– Equivalently, we can think of the world as moving and the
camera as fixed.
• Like stereo, but the position of the camera isn’t
known (and it’s more natural to use many images
with little motion between them, not just two with a lot
of motion).
– We may or may not assume we know the parameters of the
camera, such as its focal length.
Structure-from-Motion
• As with stereo, we can divide problem:
– Correspondence.
– Reconstruction.
• Again, we’ll talk about reconstruction
first.
– So for the next few classes we assume that
each image contains some points, and we
know which points match which.
Structure-from-Motion
…
Movie
Reconstruction
• A lot harder than with stereo.
• Start with simpler case: scaled
orthographic projection (weak
perspective).
– Recall, in this we remove the z coordinate
and scale all x and y coordinates the same
amount.
First: Represent motion
• We’ll talk about a fixed camera, and moving object.
• Key point:
Some matrix
Points
r r r t 

S  
xn 
 x1 x2


r r r t 
 y1
P
z
 1
1

y2
z2
1
.
.
.
yn 
zn 

1 
1, 2
1, 3
x
2 ,1
2,2
2,3
y
The image
X X . . .
I 
Y Y
1
2
I  SP
1
Then:
1,1
2
X 

Y 
n
n
Structure-from-Motion
• S encodes:
– Projection: only two lines
– Scaling, since S can have a scale factor.
– Translation, by tx and ty.
– Rotation:
I  SP
Rotation
 r1,1

 r2,1
r
 3,1
r1, 2
r2, 2
r3, 2
r1,3 

r2,3  P

r3,3 
Represents a
3D rotation of
the points in P.
First, look at 2D rotation
(easier)
Matrix R acts
on points by
rotating them.
 cos

  sin 
 cos 
R  
  sin 
sin   x1

cos  y1
x2
y2
.
sin  

cos  
.
.
xn 

yn 
• Also, RRT = Identity. RT is also a rotation
matrix, in the opposite direction to R.
Simple 3D Rotation
 cos

  sin 
 0

sin 
cos
0
0  x

0  y


1  z
1
1
1
x
y
z
2
.
.
.
2
2
Rotation about z axis.
Rotates x,y coordinates. Leaves z coordinates fixed.
x
y
z
n
n
n





Full 3D Rotation
 cos

R    sin 
 0

sin 
cos
0
0  cos 

0  0
1   sin 
0 sin   1
0

1
0  0 cos
0 cos   0  sin 
0 

sin  
cos 
• Any rotation can be expressed as combination of three
rotations about three axes.
 1 0 0


RR   0 1 0 
 0 0 1


T
• Rows (and columns) of R are
orthonormal vectors.
• R has determinant 1 (not -1).
Putting it Together
3D Translation  r
Scale
 1 0 0 t 
 r
 1 0 0 
s
 0 1 0 t 
r
 0 1 0 

 0 0 1 t 
Projection
0
1,1
x
2 ,1
y
3 ,1
z
r
r
r
0
1, 2
2,2
3, 2
r
r
r
0
1, 3
2,3
3,3
0

0
P
0

1
3D Rotation
s
s
 
s
s
where
1,1
1, 2
s
2 ,1
2,2
s
1, 3
2,3
st 
 P
st 
x
y
(s , s , s )  (s , s , s )  0
1,1
1, 2
1, 3
2 ,1
2,2
2,3
(s , s , s )  (s , s , s )
1,1
1, 2
1, 3
2 ,1
2,2
2,3
We can just write stx as
tx and sty as ty.
Affine Structure from Motion
s
s

s
s
where
1,1
1, 2
s
2 ,1
2,2
s
t
 P
t
1, 3
x
2,3
y
(s , s , s )  (s , s , s )  0
1,1
1, 2
1, 3
2 ,1
2,2
2,3
(s , s , s )  (s , s , s )
1,1
1, 2
1, 3
2 ,1
2,2
2,3
Affine Structure-from-Motion: Two
Frames (1)
u

v
u

v
1
1
1
1
2
1
2
1
u
v
u
v
1
2
1
2
2
2
2
2
.
.
.
u
v
u
v
1
n
1
n
2
n
2
n
 s
 
 s
 s
 
 s
1
1,1
1
2 ,1
2
1,1
2
2 ,1
s
s
s
s
1
1, 2
1
2,2
2
1, 2
2
2,2
s
s
s
s
1
1, 3
1
2,3
2
1, 3
2
2,3
t
t
t
t
1
x
1
y
2
x
2
y
 x

 y
 z

 1
1
1
1
x
y
z
1
2
2
2
.
.
.
x

y
z 

1
n
n
n
Affine Structure-from-Motion: Two
Frames (2)
x

y
z

1
1
To make things
easy, suppose:
1
1
x
y
z
1
2
2
2
x
y
z
1
3
3
3
x  0
 
y  0

z  0
 
1  1
4
4
4
1
0
0
1
0
1
0
1
0

0
1

1
Affine Structure-from-Motion: Two
Frames (3)
u

v
u

v
u
v
u
v
1
1
1
1
.
1
2
.
.
u
v
u
v
1
2
2
1
2
1
1
n
1
n
2
2
2
2
2
n
2
n
 s
 
 s
 s
 
 s
1
1,1
1
2 ,1
2
1,1
2
2 ,1
s
s
s
s
1
1, 2
1
2,2
2
1, 2
2
2,2
s
s
s
s
1
1, 3
1
2,3
2
1, 3
2
2,3
t
t
t
t
 x

 y
 z

 1
1
x
1
1
y
1
2
x
1
2
y
x
y
z
1
.
1
0
0
1
0
1
0
1
2
.
2
1
1
1
1
2
1
2
1
u
v
u
v
1
2
1
2
2
2
2
2
u
v
u
v
1
3
1
3
2
3
2
3
u
v
u
v
1
4
1
4
2
4
2
4
 s
 
 s
 s
 
 s
1
1,1
1
2 ,1
2
1,1
2
2 ,1
s
s
s
s
1
1, 2
1
2,2
2
1, 2
2
2,2
s
s
s
s
1
1, 3
1
2,3
2
1, 3
2
2,3
t
t
t
t
1
x
1
y
2
x
2
y
 0

 0
 0

 1
x

y
z 

1
n
n
2
n
Looking at the first four points, we get:
u

v
u

v
.
0

0
1

1
Affine Structure-from-Motion: Two
Frames (7)
But, what if the first four points aren’t so simple?
Then we define A so that:
x

y

A
z

1
1
1
1
x
y
z
1
2
2
2
x
y
z
1
3
3
3
x  0
 
y  0

z  0
 
1  1
4
4
4
1
0
0
1
0
1
0
1
0

0
1

1
This is always possible as long as the points
aren’t coplanar.
Affine Structure-from-Motion: Two
Frames (8)
u

v
u

v
Then,
given:
1
1
1
1
2
2
1
We have:
u
v
u
v
1
1
1
1
2
2
1
1
1
And:
1
1
2
1
2
1
.
2
2
2
2
.
1
2
1
2
.
.
u
v
u
v
1
2
2
2
2
2
1
n
1
n
2
2
2
2
n
2
n
.
1
n
1
2
n
2
n
1
.
u
v
u
v
n
2
u
v
u
v
.
1
2
1
u

v
u

v
.
1
2
2
1
u

v
u

v
u
v
u
v
.
u
v
u
v
1
n
1
n
2
n
2
n
 s
 
 s
 s
 
 s
s
s
s
s
2 ,1
2,2
2
1,1
1, 2
2
2 ,1
1
1
2 ,1
2
1,1
2
2 ,1
1
1, 2
1
2,2
2
1, 2
2
2,2
1
2
1, 3
2
2,2
s
s
s
s
t
t
t
t
1
2,3
2
2
2,3
s
s
s
s
1
1, 3
1
2,3
2
1, 3
2
2,3
t
t
t
t

x



y
A
A

z


1

x
y
z
1
.
1
2
2
2,3
1, 3
1
.
1, 3
2
s
s
s
s
1
x
y
z
1
2,3
2
2,2
1, 2
1
1
1, 2
2
 x

 y
 z

 1
1
1, 3
2,2
2
2 ,1
1,1
1,1
1
1,1
1
s
s
s
s
1
1, 2
2 ,1
 s
 
 s
 s
 
 s
 s
 
 s
 s
 
 s
s
s
s
s
1
1,1
1
x
1
y
2
x
2
y
1
x
y
2
x
1
y
2
x
2
y
1
2
1
1
1
1
0

0
0

1
.
2
.
.
x

y
z 

1
n
n
2
0
0
1
1
n
n
2
0
1
0
1
x

y
z 

1
n
2
1
1
0
0
1
.
2
1
2



A


1
x
1
1
y
t
t
t
t
n
.
.
.
x

y
z 

1
n
n
n
Affine Structure-from-Motion: Two
Frames (9)
u

v
u

v
1
1
Given:
1
1
2
1
2
1
u
v
u
v
.
1
2
.
.
u
v
u
v
1
2
1
n
1
n
2
2
2
2
2
n
2
n
 s
 
 s
 s
 
 s
1
1,1
1
2 ,1
2
1,1
2
2 ,1
s
s
s
s
s
s
s
s
1
1, 2
1
1, 3
1
2,2
1
2,3
2
1, 2
2
1, 3
2
2,2
2
2,3
t
t
t
t
1
x
1
y
2
x
2
y



A


1
0

0
0

1
1
0
0
1
0
1
0
1
0
0
1
1
.
Then we just pretend that:
s

s
s

s
1
1,1
1
2 ,1
2
1,1
2
2 ,1
s
s
s
s
1
1, 2
1
2,2
2
1, 2
2
2,2
s
s
s
s
1
1, 3
1
2,3
2
1, 3
2
2,3
t
t
t
t
1
x
1
y
2
x
2
y



A



1
is our motion,
and solve as
before.
.
.
x

y
z 

1
n
n
n
Affine Structure-from-Motion: Two
Frames (10)
This means that we can never determine the exact 3D
structure of the scene. We can only determine it up to
some transformation, A. Since if a structure and motion
explains the points:
u

v
u

v
1
1
1
1
2
1
2
1
u
v
u
v
1
2
.
.
.
u
v
u
v
1
2
n
1
n
2
2
2
2
So does
another of
the form:
1
2
n
2
n
u

v
u

v
1
1
1
1
2
1
2
1
u
v
u
v
1
2
1
2
2
2
2
2
 s
 
 s
 s
 
 s
1
1,1
1
2 ,1
2
1,1
2
2 ,1
.
.
.
s
s
s
s
1
1, 2
1
2,2
2
1, 2
2
2,2
u
v
u
v
1
n
1
n
2
n
2
n
s
s
s
s
t
t
t
t
1
1, 3
1
2,3
2
1, 3
2
2,3
  s
 
  s
   s
  
  s
1
1,1
1
2 ,1
2
1,1
2
2 ,1
1
x
1
y
2
x
2
y
 x

 y
 z

 1
x
y
z
1
1
1
1
1, 2
1
2,2
2
1, 2
2
2,2
.
x

y
z 

1
.
n
2
1
s
s
s
s
.
2
n
2
s
s
s
s
1
1, 3
1
2,3
2
1, 3
2
2,3
n
t
t
t
t
1
x
1
y
2
x
2
y



A


  x
 
  y
 A z
 

  1
1
1
1
1
x
y
z
1
2
2
2
.
.
.
x 

y 
z 

1  
n
n
n
Affine Structure-from-Motion: Two
Frames (11)
u

v
u

v
1
1
1
1
2
1
2
1
u
v
u
v
1
2
1
2
2
2
2
2
.
.
.
u
v
u
v
1
n
1
n
2
n
2
n
  s
 
  s
   s
  
  s
Note that A has
the form:
A corresponds to
translation of the
points, plus a
linear
transformation.
1
1,1
1
2 ,1
2
1,1
2
2 ,1
s
s
s
s
1
1, 2
1
2,2
2
1, 2
2
2,2
s
s
s
s
1
1, 3
1
2,3
2
1, 3
2
2,3
t
t
t
t
1
x
1
y
2
x
2
y



A


a

a

A
a

0
1,1
2 ,1
3 ,1
  x
 
  y
 A z
 

  1
1
1
1
1
a
a
a
0
1, 2
2,2
3, 2
x
y
z
1
2
.
.
x 

y 
z 
 
1 
.
n
2
n
2
n
a
a
a
0
1, 3
2,3
3,3
a
a
a
1
1, 4
2,4
3, 4






Download