Off-the-Shelf Vision-Based Mobile Robot Sensing

Zhichao Chen
Advisor: Dr. Stan Birchfield
Clemson University

Vision in Robotics
• A robot has to perceive its surroundings in order to interact with them.
• Vision is promising for several reasons:
  – Non-contact (passive) measurement
  – Low cost
  – Low power
  – Rich capturing ability

Project Objectives
• Path following: Traverse a desired trajectory in both indoor and outdoor environments.
  1. "Qualitative vision-based mobile robot navigation", Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2006.
  2. "Qualitative vision-based path following", IEEE Transactions on Robotics, 25(3):749-754, June 2009.
• Person following: Follow a person in a cluttered indoor environment.
  "Person Following with a Mobile Robot Using Binocular Feature-Based Tracking", Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2007.
• Door detection: Build a semantic map of the locations of doors as the robot drives down a corridor.
  "Visual Detection of Lintel-Occluded Doors from a Single Camera", IEEE Computer Society Workshop on Visual Localization for Mobile Platforms (in association with CVPR), 2008.

Motivation for Path Following
• Goal: Enable a mobile robot to follow a desired trajectory in both indoor and outdoor environments.
• Applications: courier, delivery, tour guide, and scout robots.
• Previous approaches:
  – Image Jacobian [Burschka and Hager 2001]
  – Homography [Sagues and Guerrero 2005]
  – Homography (flat ground plane) [Liang and Pears 2002]
  – Man-made environment [Guerrero and Sagues 2001]
  – Calibrated camera [Atiya and Hager 1993]
  – Stereo cameras [Shimizu and Sato 2000]
  – Omni-directional cameras [Adorni et al. 2003]

Our Approach to Path Following
• Key intuition: the system is vastly overdetermined (dozens of feature points, one control decision).
• Key result: a simple control algorithm
  – Teach/replay approach using sparse feature points
  – Single, off-the-shelf camera
  – No calibration of camera or lens
  – Easy to implement (no homographies or Jacobians)

Preview of Results
[Figure: current image, milestone image, overview, top-down view]

Tracking Feature Points
Kanade-Lucas-Tomasi (KLT) feature tracker
• Automatically selects features using the eigenvalues of the 2×2 gradient covariance matrix

  Z = ∫_W g(x) gᵀ(x) dx,

  where g(x) is the gradient of the gray-level image and W is a window around the feature.
• Automatically tracks features by minimizing the sum of squared differences (SSD) between consecutive gray-level image frames I and J over the unknown displacement d:

  ε = ∫_W [ I(x − d/2) − J(x + d/2) ]² dx

• Augmented with a gain α and bias β to handle lighting changes:

  ε = ∫_W [ α I(x − d/2) + β − J(x + d/2) ]² dx

• Open-source implementation: http://www.ces.clemson.edu/~stb/klt
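To make the tracker concrete, here is a minimal sketch of the select-and-track loop in Python, using OpenCV's implementations of the same ideas (Shi-Tomasi eigenvalue-based selection and pyramidal Lucas-Kanade) rather than the KLT library itself; the video source and all parameter values are illustrative, not those of the original system:

  import cv2

  cap = cv2.VideoCapture(0)                      # any camera or video file
  ok, frame = cap.read()
  prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

  # Select features where both eigenvalues of the gradient matrix Z are large
  features = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100,
                                     qualityLevel=0.01, minDistance=10)

  while True:
      ok, frame = cap.read()
      if not ok:
          break
      gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
      # Track each feature by minimizing the SSD over the displacement d
      pts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray,
                                                  features, None,
                                                  winSize=(15, 15))
      features = pts[status.ravel() == 1].reshape(-1, 1, 2)
      prev_gray = gray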
Teach-Replay
[Figure: Teaching phase, from start to destination: the robot detects and tracks features, storing each feature's initial and goal coordinates. Replay phase: the robot tracks features and compares each current feature with its stored goal feature.]

Qualitative Decision Rule
[Figure: landmark, image-plane feature, funnel lane, robot at goal]
• Feature is to the right (|u_Current| > |u_Goal|): "Turn right"
• No evidence: "Go straight"
• Feature has changed sides (sign(u_Current) ≠ sign(u_Goal)): "Turn left"

The Funnel Lane at an Angle
[Figure: the funnel lane rotated by an angle α about the landmark]
• Feature is to the right: "Turn right"
• No evidence: "Go straight"
• Side change: "Turn left"

A Simplified Example
[Figure: as the robot advances toward the landmark, successive funnel lanes yield "Turn right", "Go straight", and "Turn left" commands that keep it on the path]

The Funnel Lane Created by Multiple Feature Points
[Figure: intersection of the funnel lanes of landmarks #1, #2, and #3]
• Feature is to the right: "Turn right"
• No evidence: "Do not turn"
• Side change: "Turn left"

Qualitative Control Algorithm
Funnel constraints:

  |u_C| < |u_D|  and  sign(u_C) = sign(u_D),

where u_C and u_D are the current and destination image coordinates of the feature.

Desired heading from the i-th feature point:

  θ_d^i = min(u_C, φ(u_C, u_D))   if u_C > 0 and u_C > u_D
  θ_d^i = max(u_C, φ(u_C, u_D))   if u_C < 0 and u_C < u_D
  θ_d^i = 0                       otherwise

where φ is the signed distance between u_C and u_D.

Incorporating Odometry
Desired heading:

  θ_d = β · (1/N) Σ_{i=1}^{N} θ_d^i + (1 − β) θ_o,

where θ_d^i is the desired heading from the i-th feature point, θ_o is the desired heading from odometry, N is the number of features, and β ∈ [0, 1].
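The decision rule and the odometry blend translate almost directly into code. A compact sketch in Python, assuming φ(u_C, u_D) = u_C − u_D (the signed distance above) and image coordinates measured from the principal point; the function names and the default β are illustrative:

  def heading_from_feature(u_c, u_d):
      """Desired heading from one feature; u_c, u_d are the current and
      destination horizontal image coordinates of the feature."""
      phi = u_c - u_d                   # signed distance between u_C and u_D
      if u_c > 0 and u_c > u_d:         # feature right of the funnel lane
          return min(u_c, phi)
      if u_c < 0 and u_c < u_d:         # feature left of the funnel lane
          return max(u_c, phi)
      return 0.0                        # inside the funnel lane: go straight

  def desired_heading(features, theta_o, beta=0.5):
      """Blend the per-feature headings with the odometry heading theta_o."""
      thetas = [heading_from_feature(u_c, u_d) for (u_c, u_d) in features]
      theta_feat = sum(thetas) / len(thetas) if thetas else 0.0
      return beta * theta_feat + (1.0 - beta) * theta_o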
Overcoming Practical Difficulties
• To deal with rough terrain: prior to comparison, feature coordinates are warped to compensate for a non-zero roll angle about the optical axis, estimated by applying the RANSAC algorithm.
• To avoid obstacles: the robot detects and avoids an obstacle using sonar, and odometry enables it to roughly return to the path; the robot then converges to the path using both odometry and vision.

Experimental Results
[Figure: current image, milestone image, overview, top-down view]
Videos available at http://www.ces.clemson.edu/~stb/research/mobile_robot

Experimental Results: Rough Terrain

Experimental Results: Avoiding an Obstacle

Experimental Results
• Indoor: Imaging Source Firewire camera
• Outdoor: Logitech Pro 4000 webcam

Project Objectives (recap)
• Path following: enable a mobile robot to follow a desired trajectory in both indoor and outdoor environments (ICRA 2006; IEEE Transactions on Robotics 2009).
• Person following: enable a mobile robot to follow a person in a cluttered indoor environment by vision (IROS 2007).
• Door detection: detect doors as the robot drives down a corridor (CVPR workshop 2008).

Motivation for Person Following
• Goal: enable a mobile robot to follow a person in a cluttered indoor environment by vision.
• Previous approaches:
  – Appearance properties (color, edges) [Sidenbladh et al. 1999, Tarokh and Ferrari 2003, Kwon et al. 2005]: require the person to have a different color from the background or to face the camera; sensitive to lighting changes.
  – Optical flow [Piaggio et al. 1998, Chivilò et al. 2004]: drifts as the person moves with out-of-plane rotation.
  – Dense stereo and odometry [Beymer and Konolige 2001]: difficult to predict the movement of the robot (uneven surfaces, slippage in the wheels).

Our Approach to Person Following
• Algorithm: sparse stereo based on Lucas-Kanade feature tracking.
• Handles:
  – dynamic backgrounds
  – out-of-plane rotation
  – similar disparity between the person and the background
  – similar color between the person and the background

System Overview
[Figure: system-overview block diagram]

Detect 3D Features of the Scene (cont.)
• Features are selected in the left image I_L and matched in the right image I_R. The size of each square indicates the horizontal disparity of the feature.
[Figure: left image, right image]

Detecting Faces
• The Viola-Jones frontal face detector is applied.
• The detector is used both to initialize the system and to enhance robustness when the person is facing the camera.
• Note: the face detector is not necessary in our system.

Overview of Removing Background
1. Use the known disparity of the person in the previous image frame.
2. Use the estimated motion of the background.
3. Use the estimated motion of the person.

Remove Background Step 1: Using the known disparity
Discard features for which |d_t − d̃_t| exceeds a threshold, where d̃_t is the known disparity of the person in the previous frame and d_t is the disparity of a feature at time t.
[Figure: original features; foreground features after step 1; background features]

Remove Background Step 2: Using background motion
• Estimate the motion of the background by computing a 4×4 affine transformation matrix H between the image frames at times t and t + 1:

  [f_{t+1}^i ; 1] = H [f_t^i ; 1]    (1)

• The random sample consensus (RANSAC) algorithm is used to find the dominant motion.
[Figure: foreground features with similar disparity after step 1; foreground features after step 2]

Remove Background Step 3: Using person motion
• Similar to step 2, the motion model of the person is calculated.
• The person group should be the biggest group.
• The centroid of the person group should be close to the previous location of the person.
[Figure: foreground features after step 2; foreground features after step 3]
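A sketch of the step-2 estimation in Python, assuming each feature is represented as an (x, y, disparity) triple so that H acts on homogeneous 4-vectors; the minimal sample size, inlier tolerance, and iteration count are illustrative assumptions, not the values of the original system:

  import numpy as np

  def fit_affine(src, dst):
      """Least-squares 4x4 map H with dst ~ H @ src on homogeneous features."""
      S = np.hstack([src, np.ones((len(src), 1))])   # rows are (x, y, d, 1)
      D = np.hstack([dst, np.ones((len(dst), 1))])
      A, *_ = np.linalg.lstsq(S, D, rcond=None)      # solves S @ A = D
      return A.T

  def ransac_background_motion(f_t, f_t1, n_iters=200, tol=1.5, seed=0):
      """Dominant (background) motion between frames t and t+1.
      f_t, f_t1: N x 3 arrays of (x, y, disparity). Returns H, inlier mask."""
      rng = np.random.default_rng(seed)
      N = len(f_t)
      best = np.zeros(N, dtype=bool)
      for _ in range(n_iters):
          idx = rng.choice(N, size=4, replace=False)  # minimal sample
          H = fit_affine(f_t[idx], f_t1[idx])
          pred = (np.hstack([f_t, np.ones((N, 1))]) @ H.T)[:, :3]
          inliers = np.linalg.norm(pred - f_t1, axis=1) < tol
          if inliers.sum() > best.sum():
              best = inliers
      H = fit_affine(f_t[best], f_t1[best])
      return H, best

Features that do not fit the dominant motion survive as candidate person features, which step 3 then groups under a second motion model.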
System Overview
[Figure: system-overview block diagram, highlighting the tracking and control stages]

Experimental Results
[Video]

Project Objectives (recap)
• Path following (ICRA 2006; IEEE Transactions on Robotics 2009)
• Person following (IROS 2007)
• Door detection (CVPR workshop 2008)

Motivation for Door Detection
• Metric map or topological map: either way, doors are semantically meaningful landmarks.

Previous Approaches to Detecting Doors
• Range-based approaches: sonar [Stoeter et al. 1995], stereo [Kim et al. 1994], laser [Anguelov et al. 2004]
• Vision-based approaches: fuzzy logic [Munoz-Salinas et al. 2004], neural networks [Cicirelli et al. 2003], color segmentation [Rous et al. 2005]
• Limitations:
  – require different colors for doors and walls
  – simplified environment (untextured floor, no reflections)
  – limited viewing angle
  – high computational load
  – assume the lintel (top part) is visible

What is Lintel-Occluded?
• In post-and-lintel architecture, the posts are the vertical members and the lintel is the horizontal top part.
• The camera is low to the ground and cannot point upward because of obstacles, so the lintel is often occluded.
[Figure: lintel and posts of a doorway]

Our Approach to Door Detection
• Assumptions:
  – both door posts are visible
  – posts appear nearly vertical
  – the door is at least a certain width
• Key idea: multiple cues are necessary for robustness (pose, lighting, …)
[Video]

Pairs of Vertical Lines
[Figure: Canny edges, detected lines, vertical lines, non-vertical lines]
1. Edges detected by Canny
2. Line segments detected by a modified Douglas-Peucker algorithm
3. Clean-up (merge lines across small gaps, discard short lines)
4. Separate vertical and non-vertical lines
5. Door candidates given by all pairs of vertical lines whose spacing is within a given range

Homography
A point (x, y) on the door plane in the world maps to the image point (x', y') via

  [x' ; y' ; 1] ≅ H [x ; y ; 1],  H = [h11 h12 h13 ; h21 h22 h23 ; h31 h32 h33]

Prior Model Features: Width and Height
[Figure: door corners (0,0), (x,0), (0,y), (x,y) in the world plane, related to the image through the homography H and the principal point]

An Example
• As the door turns, the bottom corner traces an ellipse in the image (the projective transformation of a circle is an ellipse), although the ellipse is not horizontal.

Data Model (Posterior) Features
• Image gradient along edges (g1)
• Placement of top and bottom edges (g2, g3)
• Color (g4)
• Texture (g5)
• Kick plate (g6)
• Vanishing point (g7)
• …and two more, described below

Data Model Features (cont.)
• Bottom gap (g8): the intensity along a line under the door reveals the gap: darker (light off) gives a positive response, brighter (light on) gives a negative response, and no gap gives neither.
[Figure: intensity profiles along the line under the door]

Data Model Features (cont.)
• Concavity (g9): the door is typically recessed slightly (by ε) from the wall plane, so the vertical door lines L_left and L_right, the bottom door edge, the intersection line of wall and floor, and the extension of that intersection line form a slim "U".
[Figure: concavity geometry at the bottom of the door]

Two Methods to Detect Doors
1. Adaboost: training images → weights of features → the strong classifier

  ψ(x) = sgn( Σ_{n=1}^{N} α_n h_n(x) ),

  where h_n ∈ {−1, +1} is the hard decision of the n-th weak classifier and α_n is its weight.
2. Bayesian formulation: E(d) = E_data(I | d) + E_prior(d)  (yields better results)

Bayesian Formulation
• For an image I and a door d: p(d | I) ∝ p(I | d) p(d)
• Taking the negative log likelihood: E(d) = E_data(I | d) + E_prior(d)
• Data model: E_data(d, I) = Σ_{i=1}^{N_data} λ_i f_i(d)
• Prior model: E_prior(d) = Σ_{j=1}^{N_prior} γ_j g_j(d)
• where f_i(d), g_j(d) ∈ [0, 1]

MCMC and DDMCMC
• Markov chain Monte Carlo (MCMC) is used to maximize the probability of the detected door (like a random walk through the state space of doors).
• Data-driven MCMC (DDMCMC) is used to speed up the computation:
  – doors appear more frequently at positions close to detected vertical lines
  – the top of the door is often occluded or given by the horizontal line closest to the top
  – the bottom of the door is often close to the wall/floor boundary
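A schematic sketch of this search in Python: a door candidate is scored by the weighted energy above and refined by a Metropolis-style MCMC walk. The candidate representation, the proposal function, and the feature functions are placeholders for the nine data-model cues; a data-driven proposal would snap candidates to detected vertical lines and the wall/floor boundary, as described above:

  import numpy as np

  def energy(door, feature_fns, weights):
      """E(d) = sum_i lambda_i * f_i(d), with each f_i(d) in [0, 1]."""
      return sum(w * f(door) for f, w in zip(feature_fns, weights))

  def mcmc_door_search(initial, feature_fns, weights, proposal,
                       n_iters=1000, temperature=1.0, seed=0):
      """Random walk through the state space of doors, keeping the best."""
      rng = np.random.default_rng(seed)
      current = initial
      e_cur = energy(current, feature_fns, weights)
      best, e_best = current, e_cur
      for _ in range(n_iters):
          cand = proposal(current, rng)   # perturb posts, top, and bottom
          e_new = energy(cand, feature_fns, weights)
          # Accept downhill moves always, uphill moves with Boltzmann probability
          if e_new < e_cur or rng.random() < np.exp(-(e_new - e_cur) / temperature):
              current, e_cur = cand, e_new
              if e_cur < e_best:
                  best, e_best = current, e_cur
      return best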
Experimental Results: Similar or Different Door/Wall Color

Experimental Results: High Reflection / Textured Floors

Experimental Results: Different Viewpoints

Experimental Results: Cluttered Environments

Results
• 25 different buildings
• 600 images: 100 training, 500 testing
• 91.1% accuracy with 0.09 false positives per image
• Speed: 5 fps at 1.6 GHz (unoptimized)

False Negatives and Positives
[Figure: failure examples, with captions: strong reflection (concavity and bottom-gap tests fail); two vertical lines unavailable; distracting reflections; concavity erroneously detected]

Navigation in a Corridor
• Doors were detected and tracked from frame to frame.
• False positives are discarded when doors are not repeatedly detected.
[Video]

Conclusion
• Path following:
  – Teach-replay, comparing image coordinates of feature points (no calibration)
  – Qualitative decision rule (no Jacobians or homographies)
• Person following:
  – Detects and matches feature points between a stereo pair of images and between successive images
  – RANSAC-based procedure to estimate the motion of each region
  – Does not require the person to wear a color different from the background
• Door detection:
  – Integrates a variety of door features
  – Adaboost training and DDMCMC

Future Work
• Path following:
  – Incorporate higher-level scene knowledge to enable obstacle avoidance and terrain characterization
  – Connect multiple teaching paths in a graph-based framework to enable autonomous navigation between arbitrary points
• Person following:
  – Fuse the information with additional appearance-based information (template or edges)
  – Integrate with an EM-based tracking algorithm
• Door detection:
  – Calibrate the camera to enable pose and distance measurements, facilitating the building of a geometric map
  – Integrate into a complete navigation system that can drive down a corridor and turn into a specified room