Qualitative Vision-Based Mobile Robot Navigation

Off-the-Shelf Vision-Based
Mobile Robot Sensing
Zhichao Chen
Advisor: Dr. Stan Birchfield
Clemson University
Vision in Robotics
• A robot has to perceive its surroundings in
order to interact with them.
• Vision is promising for several reasons:
 Non-contact (passive) measurement
 Low cost
 Low power
 Rich capturing ability
Project Objectives
Path following:
Traverse a desired trajectory in
both indoor and outdoor environments.
1. “Qualitative vision-based mobile robot navigation”, Proceedings of the
IEEE International Conference on Robotics and Automation (ICRA), 2006.
2. “Qualitative vision-based path following”, IEEE Transactions on
Robotics, 25(3):749-754, June 2009.
Person following: Follow a person in a
cluttered indoor environment.
“Person Following with a Mobile Robot Using Binocular Feature-Based
tracking”, Proceedings of the IEEE International Conference on
Intelligent Robots and Systems (IROS), 2007
Door detection: Build a semantic map of the
locations of doors as the robot drives down a
corridor.
“Visual detection of lintel-occluded doors from a single camera”, IEEE
Computer Society Workshop on Visual Localization for Mobile Platforms
(in association with CVPR), 2008.
Motivation for Path Following
• Goal: Enable mobile robot to follow a desired
trajectory in both indoor and outdoor environments
• Applications: courier, delivery, tour guide, scout
robots
• Previous approaches:
• Image Jacobian [Burschka and Hager 2001]
• Homography [Sagues and Guerrero 2005]
• Homography (flat ground plane) [Liang and Pears 2002]
• Man-made environment [Guerrero and Sagues 2001]
• Calibrated camera [Atiya and Hager 1993]
• Stereo cameras [Shimizu and Sato 2000]
• Omni-directional cameras [Adorni et al. 2003]
Our Approach to Path Following
• Key intuition: Vastly overdetermined system
(Dozens of feature points, one control decision)
• Key result: Simple control algorithm
– Teach / replay approach using sparse feature points
– Single, off-the-shelf camera
– No calibration for camera or lens
– Easy to implement (no homographies or Jacobians)
Preview of Results
[Video frames: current image, milestone image, overview, top-down view]
Tracking Feature Points
Kanade-Lucas-Tomasi (KLT) feature tracker
• Automatically selects features using the eigenvalues of the 2×2 gradient covariance matrix
  $Z = \int_W g(\mathbf{x})\, g^T(\mathbf{x})\, d\mathbf{x}$,
  where $g(\mathbf{x})$ is the gradient of the image over the window $W$.
• Automatically tracks features by minimizing the sum of squared differences (SSD) between consecutive image frames:
  $\epsilon = \int_W \left[ I\!\left(\mathbf{x} - \tfrac{\mathbf{d}}{2}\right) - J\!\left(\mathbf{x} + \tfrac{\mathbf{d}}{2}\right) \right]^2 d\mathbf{x}$,
  where $I$ and $J$ are the gray-level images and $\mathbf{d}$ is the unknown displacement.
• Augmented with gain and bias to handle lighting changes:
  $\epsilon = \int_W \left[ \alpha\, I\!\left(\mathbf{x} - \tfrac{\mathbf{d}}{2}\right) + \beta - J\!\left(\mathbf{x} + \tfrac{\mathbf{d}}{2}\right) \right]^2 d\mathbf{x}$,
  where $\alpha$ and $\beta$ are the gain and bias.
• Open-source implementation [http://www.ces.clemson.edu/~stb/klt]
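As an illustration of this kind of sparse feature selection and tracking, here is a minimal sketch using OpenCV's Shi-Tomasi corner detector and pyramidal Lucas-Kanade tracker, which apply the same eigenvalue test and SSD minimization in spirit; it is an off-the-shelf stand-in, not the KLT library linked above, and all parameter values are illustrative.

```python
import cv2

def select_and_track(prev_gray, curr_gray, prev_pts=None):
    """Select sparse features in prev_gray and track them into curr_gray."""
    if prev_pts is None or len(prev_pts) == 0:
        # Shi-Tomasi corners: thresholds the smaller eigenvalue of the
        # 2x2 gradient covariance matrix over each window
        prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                           qualityLevel=0.01, minDistance=7)
    # Pyramidal Lucas-Kanade: minimizes the SSD of windows between frames
    curr_pts, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                      prev_pts, None,
                                                      winSize=(15, 15),
                                                      maxLevel=3)
    good = status.ravel() == 1
    return prev_pts[good], curr_pts[good]
```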
Teach-Replay
[Diagram. Teaching phase: detect features at the start, track them while driving to the destination, and store each initial and goal feature. Replay phase: track features from the start and compare the current feature coordinates against the goal feature coordinates.]
Qualitative Decision Rule
[Diagram: landmark, image plane, feature, robot at goal, u_Goal, u_Current, funnel lane]
• Feature is to the right: |u_Current| > |u_Goal| → "Turn right"
• No evidence → "Go straight"
• Feature has changed sides: sign(u_Current) ≠ sign(u_Goal) → "Turn left"
The Funnel Lane at an Angle
[Diagram: landmark viewed at angle α, image plane, feature, robot at goal, funnel lane]
• Feature is to the right → "Turn right"
• No evidence → "Go straight"
• Side change → "Turn left"
A Simplified Example
[Diagram: a single landmark and its feature define a funnel lane; as the robot approaches the goal it alternates between "Turn right", "Go straight", and "Turn left" to stay inside the lane]
The Funnel Lane Created by Multiple Feature Points
[Diagram: the combined funnel lane is the intersection of the funnel lanes from landmarks #1, #2, and #3, each at angle α]
• Feature is to the right → "Turn right"
• No evidence → "Do not turn"
• Side change → "Turn left"
Qualitative Control Algorithm
Funnel constraints:
$|u_C| < |u_D|$  and  $\mathrm{sign}(u_C) = \mathrm{sign}(u_D)$

Desired heading:
$$\theta_d^i = \begin{cases} \min\left(u_C,\ \varphi(u_C, u_D)\right) & \text{if } u_C > 0 \text{ and } u_C > u_D \\ \max\left(u_C,\ \varphi(u_C, u_D)\right) & \text{if } u_C < 0 \text{ and } u_C < u_D \\ 0 & \text{otherwise} \end{cases}$$

where $\varphi$ is the signed distance between $u_C$ and $u_D$, $u_C$ is the current image coordinate of the feature, and $u_D$ is its coordinate in the destination (milestone) image.
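A minimal sketch of this per-feature rule in Python, assuming the signed distance is φ(u_C, u_D) = u_C − u_D; variable names are illustrative.

```python
def per_feature_heading(u_c, u_d):
    """Per-feature desired heading from the funnel-lane rule above.
    u_c: current image coordinate of the feature
    u_d: coordinate of the same feature in the destination (milestone) image
    Assumes the signed distance phi(u_c, u_d) = u_c - u_d."""
    phi = u_c - u_d
    if u_c > 0 and u_c > u_d:      # feature left the funnel lane on the right
        return min(u_c, phi)       # turn right
    if u_c < 0 and u_c < u_d:      # feature left the funnel lane on the left
        return max(u_c, phi)       # turn left
    return 0.0                     # inside the funnel lane: go straight
```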
Incorporating Odometry
Desired heading:
$$\theta_d = \beta \left( \frac{1}{N} \sum_{i=1}^{N} \theta_d^i \right) + (1 - \beta)\, \theta_o$$
where $\theta_d^i$ is the desired heading from the $i$th feature point, $\theta_o$ is the desired heading from odometry, $N$ is the number of features, and $\beta \in [0, 1]$.
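A one-line sketch of this vision/odometry blend, assuming the per-feature headings have already been computed; the value of β is illustrative.

```python
import numpy as np

def blended_heading(per_feature_headings, odometry_heading, beta=0.5):
    """Weighted combination of the vision-based and odometry-based headings,
    as in the formula above. beta = 0.5 is an arbitrary illustrative weight."""
    vision_heading = float(np.mean(per_feature_headings))  # average over N features
    return beta * vision_heading + (1.0 - beta) * odometry_heading
```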
Overcoming Practical Difficulties
To deal with rough terrain:
Prior to comparison, feature coordinates are warped to
compensate for a non-zero roll angle about the optical axis,
which is estimated using the RANSAC algorithm.
To avoid obstacles:
The robot detects and avoids an obstacle by sonar, and the
odometry enables the robot to roughly return to the path.
Then the robot converges to the path using both odometry
and vision.
Experimental Results
Videos available at http://www.ces.clemson.edu/~stb/research/mobile_robot
Experimental Results:
Rough Terrain
Experimental Results:
Avoiding an Obstacle
Experimental Results
Indoor and outdoor experiments, using an Imaging Source Firewire camera and a Logitech Pro 4000 webcam.
Project Objectives
Path following:
Enable mobile robot to follow a
desired trajectory in both indoor and outdoor
environments.
1. “Qualitative vision-based mobile robot navigation”, Proceedings of the
IEEE International Conference on Robotics and Automation (ICRA), 2006.
2. “Qualitative vision-based path following”, IEEE Transactions on
Robotics, 25(3):749-754, June 2009.
Person following: Enable a mobile robot to
follow a person in a cluttered indoor environment
by vision.
“Person Following with a Mobile Robot Using Binocular Feature-Based
tracking”, Proceedings of the IEEE International Conference on
Intelligent Robots and Systems (IROS), 2007
Door detection: Detect doors as the robot
drives down a corridor.
“Visual detection of lintel-occluded doors from a single camera”, IEEE
Computer Society Workshop on Visual Localization for Mobile Platforms
(in association with CVPR), 2008.
Motivation
• Goal: Enable a mobile robot to follow a person in a cluttered
indoor environment by vision.
• Previous approaches:
• Appearance properties: color, edges.
[Sidenbladh et al. 1999, Tarokh and Ferrari 2003, Kwon et al. 2005]
– Requires the person to have a different color from the background or to face the camera; sensitive to lighting changes.
• Optical flow.
[Piaggio et al. 1998, Chivilò et al. 2004]
– Drifts as the person moves with out-of-plane rotation.
• Dense stereo and odometry. [Beymer and Konolige 2001]
– Difficult to predict the movement of the robot (uneven surfaces, wheel slippage).
Our approach
• Algorithm: Sparse stereo based on
Lucas-Kanade feature tracking.
• Handles:
• Dynamic backgrounds.
• Out-of-plane rotation.
• Similar disparity between the person and
background.
• Similar color between the person and
background.
System overview
Detect 3D features of the scene (cont.)
• Features are selected in the left image I_L and matched in the right image I_R.
[Figure: left and right images; the size of each square indicates the horizontal disparity of the feature]
System overview
Detecting Faces
• The Viola-Jones frontal face detector is applied.
• This detector is used both to initialize the system
and to enhance robustness when the person is
facing the camera.
Note: The face detector is not necessary in our system.
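The slides do not specify the detector implementation or parameters; a minimal sketch using OpenCV's bundled Haar cascade as an off-the-shelf Viola-Jones detector might look like this (all parameter values are illustrative).

```python
import cv2

# Off-the-shelf Viola-Jones frontal face detector (OpenCV's bundled cascade)
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(gray_image):
    """Return a list of (x, y, w, h) boxes for frontal faces."""
    return cascade.detectMultiScale(gray_image, scaleFactor=1.1, minNeighbors=5)
```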
Overview of Removing Background
1) Using the known disparity of the person in the previous image frame.
2) Using the estimated motion of the background.
3) Using the estimated motion of the person.
Remove Background
Step 1: Using the known disparity
Discard features for which $|d_t - \tilde{d}_t| > \tau_d$, where $\tilde{d}_t$ is the known disparity of the person in the previous frame, $d_t$ is the disparity of a feature at time $t$, and $\tau_d$ is a disparity threshold.
[Figure: original features; foreground features after step 1; background features]
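A minimal sketch of this disparity gate in Python; the feature representation and the threshold value are assumptions for illustration.

```python
import numpy as np

def keep_person_candidates(features, disparities, person_disparity, tau_d):
    """Step 1 sketch: keep features whose disparity is within tau_d of the
    person's disparity from the previous frame; the rest are background."""
    disparities = np.asarray(disparities, dtype=float)
    keep = np.abs(disparities - person_disparity) <= tau_d
    return [f for f, k in zip(features, keep) if k]
```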
Remove Background
Step 2: Using background motion
• Estimate the motion of the background by computing a
4 × 4 affine transformation matrix H between two image
frames at times t and t + 1:
$$\begin{bmatrix} f_{t+1}^i \\ 1 \end{bmatrix} = H \begin{bmatrix} f_t^i \\ 1 \end{bmatrix} \qquad (1)$$
where $f_t^i$ is the $i$th feature at time $t$.
• The random sample consensus (RANSAC) algorithm is used to find the dominant motion.
[Figure: foreground features with similar disparity after step 1; foreground features after step 2]
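A generic RANSAC sketch for estimating the dominant (background) motion from feature correspondences; it fits an affine transform by least squares on random minimal subsets. The feature dimensionality, iteration count, and inlier tolerance are illustrative assumptions, not values from the paper.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine transform mapping src -> dst.
    src, dst: (n, d) arrays of corresponding feature coordinates."""
    n, d = src.shape
    A = np.hstack([src, np.ones((n, 1))])        # homogeneous coordinates
    X, *_ = np.linalg.lstsq(A, dst, rcond=None)  # (d+1, d) affine parameters
    H = np.eye(d + 1)
    H[:d, :] = X.T
    return H

def ransac_dominant_motion(src, dst, iters=200, tol=2.0):
    """RANSAC sketch for the dominant (background) motion between two frames.
    iters and tol are illustrative values, not taken from the paper."""
    n, d = src.shape
    best_inliers = np.zeros(n, dtype=bool)
    for _ in range(iters):
        sample = np.random.choice(n, d + 1, replace=False)  # minimal sample
        H = fit_affine(src[sample], dst[sample])
        pred = (np.hstack([src, np.ones((n, 1))]) @ H.T)[:, :d]
        inliers = np.linalg.norm(pred - dst, axis=1) < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() <= d:      # degenerate case in this sketch
        best_inliers[:] = True
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers
```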
Remove Background
Step 3: Using person motion
• Similar to step 2, the motion model of the person is
calculated.
• The person group should be the largest group.
• The centroid of the person group should be close to the previous location of the person.
[Figure: foreground features after step 2; foreground features after step 3]
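A sketch of this selection step, assuming candidate groups of feature coordinates have already been formed (for example, by repeated RANSAC fits); the gating distance is an illustrative parameter.

```python
import numpy as np

def select_person_group(groups, prev_person_centroid, max_dist):
    """Step 3 sketch: among candidate feature groups (each an (n, 2) array of
    image coordinates), prefer the largest group whose centroid stays close to
    the person's previous location. max_dist is an assumed gating threshold."""
    best = None
    for pts in groups:
        centroid = pts.mean(axis=0)
        if np.linalg.norm(centroid - prev_person_centroid) > max_dist:
            continue                     # too far from where the person was
        if best is None or len(pts) > len(best):
            best = pts                   # keep the biggest remaining group
    return best
```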
System overview
Experimental Results
Video
Project Objectives
Path following:
Enable mobile robot to follow a
desired trajectory in both indoor and outdoor
environments.
1. “Qualitative vision-based mobile robot navigation”, Proceedings of the
IEEE International Conference on Robotics and Automation (ICRA), 2006.
2. “Qualitative vision-based path following”, IEEE Transactions on
Robotics, 25(3):749-754, June 2009.
Person following: Enable a mobile robot to
follow a person in a cluttered indoor environment
by vision.
“Person Following with a Mobile Robot Using Binocular Feature-Based
tracking”, Proceedings of the IEEE International Conference on
Intelligent Robots and Systems (IROS), 2007
Door detection: Detect doors as the robot
drives down a corridor.
“Visual detection of lintel-occluded doors from a single camera”, IEEE
Computer Society Workshop on Visual Localization for Mobile Platforms
(in association with CVPR), 2008.
Motivation for Door Detection
• Metric map
• Topological map
Either way, doors are semantically meaningful landmarks.
Previous Approaches to
Detecting Doors
Range-based approaches:
• sonar [Stoeter et al. 1995], stereo [Kim et al. 1994], laser [Anguelov et al. 2004]
Vision-based approaches:
• fuzzy logic [Munoz-Salinas et al. 2004]
• neural network [Cicirelli et al. 2003]
• color segmentation [Rous et al. 2005]
Limitations:
• require different colors for doors and walls
• simplified environment (untextured floor, no reflections)
• limited viewing angle
• high computational load
• assume lintel (top part) visible
What is Lintel-Occluded?
Lintel-occluded: the lintel (the top, horizontal part of the door frame) is not visible because
• corridors use post-and-lintel architecture
• the camera is mounted low to the ground
• the camera cannot point upward because of obstacles
[Figure: door frame with lintel and posts labeled]
Our Approach
$$\psi(x) = \mathrm{sign}\!\left(\sum_{n=1}^{N} \alpha_n h_n(x)\right)$$
Assumptions:
• Both door posts are visible
• Posts appear nearly vertical
• The door is at least a certain
width
Key idea: Multiple cues are necessary for robustness (pose, lighting, …)
Video
Pairs of Vertical Lines
[Figure: Canny edges, detected lines, vertical and non-vertical lines]
1. Edges detected by Canny
2. Line segments detected by a modified Douglas-Peucker algorithm
3. Clean up (merge lines across small gaps, discard short lines)
4. Separate vertical and non-vertical lines
5. Door candidates given by all vertical line pairs whose spacing is within a given range (see the sketch below)
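A rough sketch of steps 1-5 in Python/OpenCV. It substitutes a probabilistic Hough transform for the modified Douglas-Peucker line detector named above, and all thresholds (Canny limits, line lengths, spacing range, verticality tolerance) are illustrative assumptions.

```python
import cv2
import numpy as np

def door_candidates(gray, min_sep=40, max_sep=300, vert_tol_deg=10):
    """Detect near-vertical line segments and pair them as door candidates."""
    edges = cv2.Canny(gray, 50, 150)
    segs = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=50,
                           minLineLength=60, maxLineGap=5)
    verticals = []
    if segs is not None:
        for x1, y1, x2, y2 in segs[:, 0]:
            angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1)))
            if abs(angle - 90) < vert_tol_deg:       # keep near-vertical lines
                verticals.append(((x1 + x2) / 2.0, (y1, y2)))
    # Pair vertical lines whose horizontal spacing is within the allowed range
    pairs = []
    for i in range(len(verticals)):
        for j in range(i + 1, len(verticals)):
            sep = abs(verticals[i][0] - verticals[j][0])
            if min_sep <= sep <= max_sep:
                pairs.append((verticals[i], verticals[j]))
    return pairs
```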
Homography
In the image: (x', y'). In the world: (x, y).

$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} \cong H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
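For concreteness, a small helper that applies such a homography to a point (homogeneous coordinates, normalized by the third component); this is standard projective geometry rather than anything specific to the paper.

```python
import numpy as np

def apply_homography(H, x, y):
    """Map a point (x, y) through the 3x3 homography H, returning (x', y')."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]
```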
Prior Model Features:
Width and Height
[Figure: prior-model terms for door width and height, expressed in the image relative to the principal point using the corners (0,0), (x,0), (0,y), (x,y) and the homography]
An Example
As the door turns, its bottom corner traces an ellipse in the image (the projective transformation of a circle is an ellipse), although the ellipse is not horizontal.
Data Model (Posterior) Features
• Image gradient along edges (g1)
• Placement of top and bottom edges (g2, g3)
• Color (g4)
• Texture (g5)
• Kick plate (g6)
• Vanishing point (g7)
…and two more.
Data Model Features (cont.)
[Figure: intensity along a line below the door is darker (light off) or brighter (light on) where a gap is present; no change where there is no gap]
• Bottom gap (g8)
Data Model Features (cont.)
[Figure: concavity (g9). A door is typically recessed, forming a slim “U” between the vertical door lines; the bottom door edge is offset by a small amount ε from the extension of the wall/floor intersection lines L_left and L_right]
• Concavity (g9)
Two Methods to Detect Doors
Training images → Adaboost → weights of the features, used in either of two ways:
• The strong classifier (sketched below):
$$\psi(x) = \mathrm{sgn}\!\left(\sum_{n=1}^{N} \alpha_n h_n(x)\right)$$
where $h_n \in \{-1, +1\}$ is the hard decision of each weak classifier.
• The Bayesian formulation (yields better results):
$$E(d) = E_{\mathrm{data}}(I \mid d) + E_{\mathrm{prior}}(d)$$
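A minimal sketch of evaluating that strong classifier given the weak decisions and learned weights; the array names are illustrative.

```python
import numpy as np

def strong_classifier(weak_decisions, weights):
    """Sign of the weighted sum of hard weak decisions h_n in {-1, +1}."""
    score = float(np.dot(weights, weak_decisions))
    return 1 if score >= 0 else -1
```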
Bayesian Formulation
$$p(d \mid I) \propto p(I \mid d)\, p(d)$$
where $I$ is the image and $d$ is the door.

Taking the log likelihood,
$$E(d) = E_{\mathrm{data}}(I \mid d) + E_{\mathrm{prior}}(d)$$

Data model:
$$E_{\mathrm{data}}(d, I) = \sum_{i=1}^{N_{\mathrm{data}}} \alpha_i\, f_i(d)$$
Prior model:
$$E_{\mathrm{prior}}(d) = \sum_{j=1}^{N_{\mathrm{prior}}} \beta_j\, g_j(d)$$
where $f_i(d),\, g_j(d) \in [0, 1]$.
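A direct transcription of these two sums as a sketch; the weight vectors correspond to the α_i and β_j in the reconstruction above, and the function name is illustrative.

```python
import numpy as np

def door_energy(f_vals, alpha, g_vals, beta):
    """E(d) = E_data + E_prior, with data terms f_i(d) and prior terms g_j(d)
    in [0, 1] and their learned weights."""
    return float(np.dot(alpha, f_vals) + np.dot(beta, g_vals))
```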
MCMC and DDMCMC
• Markov chain Monte Carlo (MCMC) is used here to maximize the probability of the door configuration (like a random walk through the state space of doors).
• Data-driven MCMC (DDMCMC) is used to speed up the computation, exploiting the observations that
– doors appear more frequently at positions close to the vertical lines,
– the top of the door is often occluded or is the horizontal line closest to the top of the image, and
– the bottom of the door is often close to the wall/floor boundary.
A sketch of the sampling loop follows below.
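The slides do not give the sampler's details; a generic Metropolis-style sketch over door hypotheses is shown here, where `energy` is the E(d) above and `propose` would be a data-driven proposal (for example, snapping to detected vertical lines and the wall/floor boundary). Parameter values are illustrative.

```python
import numpy as np

def mcmc_door_search(energy, propose, init_door, iters=1000, temperature=1.0):
    """Minimal Metropolis-style search over door hypotheses (a sketch, not the
    paper's DDMCMC). energy(d) scores a hypothesis (lower is better);
    propose(d) draws a new hypothesis near the current one."""
    current, best = init_door, init_door
    e_curr = energy(current)
    e_best = e_curr
    for _ in range(iters):
        cand = propose(current)
        e_cand = energy(cand)
        # Accept downhill moves always, uphill moves with Boltzmann probability
        if e_cand < e_curr or np.random.rand() < np.exp((e_curr - e_cand) / temperature):
            current, e_curr = cand, e_cand
            if e_curr < e_best:
                best, e_best = current, e_curr
    return best
```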
Experimental Results: Similar
or Different Door/Wall Color
Experimental Results: High
Reflection / Textured Floors
Experimental Results:
Different Viewpoints
Experimental Results:
Cluttered Environments
Results
25 different buildings
600 images:
• 100 training
• 500 testing
91.1% accuracy with 0.09 false positives per image
Speed: 5 fps on a 1.6 GHz processor (unoptimized)
False Negatives and Positives
[Failure cases: strong reflection causes the concavity and bottom-gap tests to fail; distracting reflections leave two vertical lines unavailable or cause concavity to be erroneously detected]
Navigation in a Corridor
• Doors were detected and tracked from frame to frame.
• False positives are discarded when doors are not repeatedly detected.
Video
Conclusion
• Path following
– Teach-replay, comparing image coordinates of feature points (no calibration)
– Qualitative decision rule (no Jacobians, homographies)
• Person following
– Detects and matches feature points between a stereo pair of images and between successive images
– RANSAC-based procedure to estimate the motion of each region
– Does not require the person to wear a different color from the background
• Door detection
– Integrates a variety of door features
– Adaboost training and DDMCMC
Future Work
• Path following
– Incorporating higher-level scene knowledge to enable obstacle avoidance and terrain characterization
– Connecting multiple teaching paths in a graph-based framework to enable autonomous navigation between arbitrary points
• Person following
– Fusing the information with additional appearance-based information (template or edges)
– Integration with an EM tracking algorithm
• Door detection
– Calibrating the camera to enable pose and distance measurements, facilitating the building of a geometric map
– Integrating into a complete navigation system able to drive down a corridor and turn into a specified room