Multiple Relative Pose Graphs for Cooperative Mapping by Been Kim Submitted to the Department of Aeronautics and Astronautics in partial fulfillment of the requirements for the degree of MASSACHUSETTS INSTI")TE OF TECHNOLOGY Master of Science in Aeronautics and Astronautics JUN 2 3 2010 at the LIBRARIES MASSACHUSETTS INSTITUTE OF TECHNOLOGY June 2010 © Massachusetts Institute of Technology 2010. All rights reserved. ARCHNES Author...................................................... Department of Aeronautics and Astronautics June 4, 2010 A Certified by............ ....................... Prof. John Leonard Professor of Mechanical and Ocean Engineering Thesis Supervisor A I A .................. Prof. D vd L.Darmofal Chairman, Committee on diaduate Theses A ccepted by ......................... Multiple Relative Pose Graphs for Cooperative Mapping by Been Kim Submitted to the Department of Aeronautics and Astronautics on June 4, 2010, in partial fulfillment of the requirements for the degree of Master of Science in Aeronautics and Astronautics Abstract This thesis describes a new representation and algorithm for cooperative and persistent simultaneous localization and mapping (SLAM) using multiple robots. Recent pose graph representations have proven very successful for single robot mapping and localization. Among these methods, iSAM (incremental smoothing and mapping) gives an exact incremental solution to the SLAM problem by solving a full nonlinear optimization problem in real-time. In this paper, we present a novel extension to iSAM to facilitate multi-robot mapping based on multiple pose graphs. Our main contribution is a relative formulation of the relationship between multiple pose graphs. Our formulation avoids the initialization problem and leads to an efficient solution when compared to a completely global solution. Efficient access to covariances at any time for relative parameters is also provided, facilitating data association and loop closing. Each individual pose graph still uses a global parameterization, so that the overall system provides a globally consistent multi-robot solution. The performance of the technique is illustrated on a publicly available multi-robot data set as well as other data including a helicopter-ground robot combination. Thesis Supervisor: Prof. John Leonard Title: Professor of Mechanical and Ocean Engineering Acknowledgments First of all, I would like to thank my advisor Prof. John Leonard for his great advices, brilliant ideas, insights and warm support throughout my master's program. I was able to courageously move forward at challenging moments at MIT, because John was there to support me. I learned tremendously from him, and what I learned will be with me throughout my life. It was an honor to work with him, and his inspiration has been invaluable. I would also like to thank our marine robotics group - Michael Kaess, Maurice, Hordur, Aisha, Jacques, John McDonald, Rob, Georgios, Hunter and Michael Benjamin for their help and support. Special thanks to Michael Kaess who was always kind and patient in answering my numerous questions, gave me great advices, and contributed greatly to the work presented in this thesis and proofreading it. Also thanks to Maurice and Hordur for providing their broad engineering knowledge to our group. It was great to learn from you. I would also like to thank Prof. Seth Teller for his support from the very beginning at MIT. His advices, as well as his belief in me, have meant a lot to me. His mentorship always reminded me of the reason why I choose to be here, and not anywhere else, and the reason to love this place. The MIT robotics group people - Luke, Albert, Aisha, Matt, Andrew, Sachi, Alex Bahr, Olivier, Jun-geun, Soo-ho, Eduardo, Rick and my previous officemate Ed and David, helped me adjust to research at MIT with warm guidance. Special thanks to Luke Fletcher for his support and teaching me with his talented engineering skills and deep knowledge. He showed me how things can be beautifully done when it is done by the best engineer. I also thank Matt, Abe, Vishnue and Brandon for their work on the project with me. For the work presented in this thesis, I thank our collaborators - Michael Kaess, Luke, Prof. John Leonard, Abraham, Prof. Nicholas Roy and Prof. Seth Teller. This work would not have been possible without their brilliance, great advice, vision and suggestions. It was a great opportunity and an honor to work with them on this topic. I also thank to my sponsors. This work was partially supported by the Ford-MIT Alliance, the MIT CSAIL Agile Robotics for Logistics Projects, and by ONR grants N0001405-1-0244, N00014-06-1-0043 and N00014-07-1-0749. Han-Lim Choi, Soon-Min Bae and Yun Lee provided great advices and support. Being mentored by one of the finest engineers from my country was one of the greatest things being at MIT. Thanks to people from Aero-Astro Office, Barbara, Marie and Beth for their help. I am amazed about their efficiency and speed on their job. Special thanks to Barbara for her warmest support and true concerns of the well-being of graduate students. I wish her the best of luck in the future. Reading a draft of a thesis is not exactly how graduate students would like to spend their spare time. Even so, James, Hiro, Britt, Amy and Rob were patient and considerate enough to proof-read this thesis. I hope they now know how wonderful robotics research is by reading this thesis line by line. I am happy to have you all to blame about every error this thesis may have. Thank you so much for your time and effort. Being a GA3 international student chair was another one of the greatest things happen to me at MIT. Working with talented leaders from all over the world and collaborating with the department head, Prof. Ian Waitz and Prof. David Darmofal, has significantly helped me to develop better leadership and collaboration skills. The skills that I learned will be valuable throughout my career. You have waited long, my friends. I want to thank my very wonderful friends that I have been lucky enough to meet at MIT: Britt, James, Francois, Hiro, Amy, Francesco, Chiara, Bjoern, Andreas, Sunny, Elza, Caley, Fabio, Josh, Brent, Alex chan, Jacqueline, Ying, Yale, Yang, James Cowling, Eugene, Leo, Ardemis, Sameera, Brandon, Namiko, Shinji, Nanako, Lynn, Ai and Marty for their endless entertainment and laughs that always brightened my days. All are brilliant people with a great sense of humor. Also my great friends in Korea, Daun, Yae-Lin, Tae-woon, Yun-jung, A-young, who all care so much about me that they become upset about not getting updates from me often enough. I miss you all. A big special thanks to my very best friend Britt Bille Rasmussen. I do not know where to start to thank you. I cannot imagine my life here without your silliness, oddness, creative way of showing considerateness, caring, kindness and craziness. Times that we spent together, trips that we took, small moments that we were being silly together are all memories that I will treasure for the rest of my life. And yes, we will keep making more of this forever. My family also deserves many thanks. They gave me this opportunity to study at a wonderful place with fantastic people. Without my father's support and trust, my mother's love and caring, I would not be who I am or where I am. My sister's warmest caring and love will not be replaced with anything else. This abnormally long acknowledgment shows how lucky I am. Thank you. Contents 11 1 Introduction 2 . . . . . . . . . . . . ... 11 1.1 Perception with cooperative mobile robots . 1.2 The localization and mapping problem... . . . . . . . . . . . . . . . . 13 1.3 Structure of the thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 16 Previous research 2.1 2.2 Formulation of the SLAM problem. . . . . . . . . . . . . . . . . 16 17 2.1.1 Preliminaries... . . . . . . . . . . . 2.1.2 The localization problem........ . . 2.1.3 The mapping problem 2.1.4 Combining the localization and mapping problems: SLAM . . . . . 21 . . . . . . . . . . ... 19 . . . . . . . . . . . . . . . . . . . . . . . . 20 . . . . . . . 25 Pose graph methods: smoothing and mapping... . . . . . 2.2.1 Smoothing methods for the SLAM problem . . . . . . . . . . . . . 26 2.2.2 Square root smoothing and mapping (square root SAM) 2.2.3 Efficient large-scale smoothing and mapping using submaps . . . . 30 2.3 Incremental smoothing and mapping.. . . 2.4 Cooperative SLAM.. . . . . . 2.5 . . . . . . . . . . . . . . ... . . . . . . 27 . . . . . . . . . . . . . ... 30 . . . . . . . . . . . . . . . . . . . . . . 35 2.4.1 Cooperative robots for reducing odometry error . . . . . . . . . . . 35 2.4.2 Cooperative SLAM using Kalman filter 2.4.3 Cooperative SLAM using Monte Carlo methods 2.4.4 Information form SLAM......... . . . . . . . . . . . . . . 37 2.4.5 Smoothing and mapping approaches. . . . . . . . . . . . . . . 37 Alternative approaches... . . . . . . . . . . . . . . . . . . . . . ... 36 . . . . . . . . . . 36 . . . . . . . . . . . . . . 38 2.5.1 2.6 2.7 3 4 5 FastSLAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 . 41 Scan matcher and loop closing techniques....... . . . . . . . . . . . . . . . . 41 2.6.1 Scan matching....... . . . . . . . . . . . . . . . 2.6.2 Loop closing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 Summary .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 44 Cooperative mapping with multiple relative pose graphs 44 ................................. 3.1 Anchor nodes ........ 3.2 Pose graph formulation... . . . . . . . . . . . . . . . . . . . . . . . . 45 3.2.1 Multiple pose graphs: global versus relative . . . . . . . . . . . . . 47 3.2.2 Summ ary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 54 Experimental results 4.1 Stata Center data set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 4.2 Fort AP Hill data set.... . . 4.3 Helicopter and ground robot data set . . . . . . . . . . . . . . . . . . . . . 58 4.4 Performance evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 . . . . . . . . . . . . . . . . . . . . . . . 58 68 Conclusion 5.1 Summary of contributions 5.2 Future work.............. . . . . . . . . . . . . . . . . . . . . . . . . . . 68 .... .... .... ... . . . . . 69 . . . . . . . . . . . . . . . 69 5.2.1 Open challenges of multi-robot SLAM 5.2.2 Future applications . . . . . . . . . . . . . . . . . . . . . . . . . . 72 5.2.3 Towards fully autonomous navigation......... . . . . . . 74 List of Figures . . . . . . . . . 1-1 Helicopter and car collaboration 2-1 Graphical model of the localization problem. ....... 2-2 Graphical model of the mapping problem. . . . . . . . . . . . . . . . . . 20 2-3 Graphical model of the online SLAM problem. . . . . . . . . . . . . . . . 22 2-4 Graphical model of the full SLAM problem. . . . . . . . . . . . . . . . . 23 2-5 Sparsity of R matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 2-6 A single robot mapping with iSAM 2-7 FastSLAM implementation . . . . . . . . . . . . . . . . . . . . . . . . . . 39 2-8 Segmentation of laser scans . . . . . . . . . . . . . . . . . . . . . . . . . . 40 2-9 Loop closing examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 3-1 An implementation of our algorithm 3-2 From single robot pose graph to multiple relative pose graphs . . . . . . . . 48 3-3 From single robot pose graphs to multiple relativ e pose graph . . . . . . . . 49 4-1 Stata Center data set: sequences of map merging . . . 55 4-2 Stata Center data set: the map overlaid on a floor plan . 56 4-3 Stata Center data set: the resulting map . . . . . . . . 57 4-4 Fort AP Hill data set: sequences of map merging . . . 59 4-5 Fort AP Hill data set: individual maps . . . . . . . . . 60 4-6 Fort AP Hill data set: the resulting map 61 4-7 Encounter between helicopter and ground robot using checkerboard 4-8 Cooperative mapping using a quadrotor and a ground robot: Individual maps . .... .. . 19 . . . . . . . . . . . . . . . . . . . . . 34 . ... . . . . . . . . . . . . . . . 46 . . . . . . . . . . . . 4-9 Cooperative mapping using a quadrotor and a ground robot: the resulting m ap .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 4-10 Comparison of convergence speed . . . . . . . . . . . . . . . . . . . . . . 65 4-11 Evolution of the covariance matrix of anchor node . . . . . . . . . . . . . . 66 4-12 Sparse R matrix of Stata Center data set . . . . . . . . . . . . . . . . . . . 66 . . . . . . . . . . . . . . 73 5-1 Heterogeneous multi-robot system collaboration. 5-2 Integration of robotics navigation algorithms . . . . . . . . . . . . . . . . . 74 List of Tables 4.1 Overview of the data sets . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 Chapter 1 Introduction 1.1 Perception with cooperative mobile robots Autonomous robots have the prospects of transforming human life. Robots can perform certain tasks with more precision, or repetitive work for long hours. They can also be placed in extreme working environments that a human cannot survive or would want to avoid, or be used to augment strength. Some industries already use robots for tasks such as car assembly and semi-conductor assembly. Having multiple autonomous robots working together can allow us to expand the range of applications. Robots can collaborate effectively by sharing information, and complete the mission faster than a single robot. Moreover, by working together, they can perform more complicated tasks. A team of heterogeneous robots is especially beneficial. One example is collaboration between ground vehicles and aerial vehicles for target tracking. Such a combination of human controlled vehicles is already common. Figure 1-1 depicts such a case, where a police car chases the target while a helicopter helps to track the target from the sky. This collaboration is effective because the helicopter can provide precise target location information even if the the ground vehicle loses sight of the target, while the ground vehicle can use this information to physically apprehend the target from the ground which the helicopter cannot do. This is just one example of heterogeneous multi-agent cooperation. However, it is not easy to build multiple robots with this capability, even building a ................................ ....... .................. ....... Figure 1-1: Helicopter and car collaboration [1]. single autonomous robot can be difficult. Suppose we want a robot to do a task that would be simple to a human. For example, consider the following task: go to the kitchen, pick up a specific cup and bring it back to the office. First, a robot has to know where the kitchen is (mapping) and know how to get there from the current location (path planning). Second, it has to identify the specific cup using available cues such as color, size, shape of the cup (object recognition) and grasp the cup without breaking other dishes (manipulation). Third, it also has to know how to locate itself relative to the office (localization) and then to return to the office. Being able to perform such a task in any environment with arbitrary objects is an unsolved problem. This is because building useful knowledge about the world using noisy sensors is challenging as it involves difficult estimation problems. Different disciplines in robotics investigate different parts of this task. What is common among all these sub-tasks is they require understanding of the surroundings which means solving the perception problem; a solution of which would play an important role as an input to any of the other high level tasks. The perception problem has many subfields, but this thesis focuses on one aspect of the problem - the localization and mapping problem. As Cox et al. said [7] using sensory information to locate the robot in its environment is a most fundamental skill which would provide a mobile robot with autonomous capabilities. 1.2 The localization and mapping problem The ideal autonomous robot would be able to perform in any environment without prior information. The robot's perception system should be able to build a map, and localize itself in that map at the same time, in an unknown environment. This is called the simultaneous localization and mapping (SLAM) problem. There has been much research on SLAM in the past decades including excellent surveys by Durrant-Whyte and Bailey [12], [4] and Thrun, Burgard and Fox [46]. This thesis will focus on the SLAM problem for multi-robot systems. The goal of our work is to develop new multi-robot mapping algorithms that provide robust, real-time performance in complex environments. The ideal solution to cooperative SLAM would have the following criteria: " Consistency " Efficiency e Scaling " Robustness First, solutions for the map and localization should be consistent within a robot and as a whole system, where consistency is defined as the minimum least-squares estimate based on the obtained measurements. Consistent solutions take into account all available information from all robots up to the current time. For example, when a robot comes back to the same location, the information should be propagated throughout the entire map generated since the previous visit. Second, the algorithm should be efficient for real-time operations: complexity should be kept low without making strong assumptions or limiting applications. Third, the algorithm should scale well with the number of robots. Fourth, cooperative SLAM algorithms should be robust. In case of malfunction, intermittent communication or other types of failure, the algorithm still should be able to achieve the task. However, localization and mapping for a single robot or multi-robot are challenging due to the following reasons. First, a sensor introduces uncertainty to measurements. In case of range sensors, a 12m range to an object may actually be estimated as 12.05m or 11.9m due to noise. It can also be wrong measurement, in which case the actual range will remain unknown. Second, there are unknown factors in system modeling and estimation. For example, a mobile platform typically estimates its position by modeling its motion. However, not only the exact model, but also the exact movement of the robot relative to the ground is unknown because of other factors (e.g. ground slip and calibration error). Only the input commands that are sent or sensor data obtained are known. Even if the exact motion model is known it is often nonlinear, due to typically rotational motion. Therefore, approximation or linearization is needed for computational reasons, which may introduce additional error. There are also challenges for single robot applications that become more crucial for multi-robot systems. One is the fact that achieving consistency often conflicts with computational efficiency. The reason is that consistent solutions must consider all collected information, while doing so increases complexity. Maintaining a system which can operate on-line often requires a trade off. This becomes even more challenging when information from multiple robots is considered, further increasing complexity. Related to this issue is computational power, because running algorithms on multiple robots requires more computational power - which is a limited resources. There are also challenges that only occur in multi-robot systems. One is communication between robots. Reliable communication between robots is important, as failure of communication can have significant effects on mission completion. Therefore, dealing with communication failure and subsequent recovery has to be considered to guarantee robustness. Another challenge is designing the architecture for distributed real-time implementation. In addition, the algorithm has to scale well not only to the size of the environment, but also to the number of robots. This thesis presents an algorithm for the localization and mapping problem with a multi- robot system which moves towards satisfying criteria mentioned above. Among the challenges mentioned above, limited communication is outside of the scope of this thesis. 1.3 Structure of the thesis The thesis is organized as follows: Chapter 2 reviews the SLAM problem for one or more robots and summarizes the iSAM technique, developed by Kaess and Dellaert [22], that serves as a foundation for this work. Chapter 3 presents our new algorithm for cooperative mapping based on multiple relative pose graphs. Our key innovation is the introduction of "anchor nodes" to serve as the links between the pose graphs of different robots [23]. Chapter 4 presents experimental results for various data sets. Chapter 5 concludes the thesis by summarizing our contributions and making recommendations for future work. Chapter 2 Previous research This chapter reviews the literature relevant to the cooperative mapping problem. We begin with a review of the basic SLAM formulation for a single robot, and then provide a summary of recent successful algorithms based on pose graph representations. Subsequently, we review previous work in cooperative SLAM and describe the topics of scan matching and loop closing, which are used in our experiments reported in Chapter 4. 2.1 Formulation of the SLAM problem This section discusses details of how to formulate the simultaneous localization and mapping problem, following Thrun et al. [46]. We first describe the two easier problems of (1) localization given an a priorimap and (2) mapping given precise localization information. Subsequently, we combine these two problems to define simultaneous localization and mapping. The spatial information that is available to a robot is inherently uncertain. For instance, a sensor cannot measure the location of an object perfectly, and even if it could, the exact model of the robot's motion is unknown. To address this uncertainty, Smith et al. [42] introduce the stochastic map - a spatial representation based on a network of uncertain spatial relationships. The map contains estimates of the spatial relationships, their uncertainties, and their inter-dependencies. Starting from this work of Smith et al., the field of SLAM has adopted the stochastic map to deal effectively with the inherent uncertainty of spatial relationships and to utilize uncertainties for more intelligent decisions. Having uncertainty as part of a spatial representation also brings up the question of how to combine uncertainty from multiple sensors or different types of sensors to get the right estimate of overall uncertainty. The spatial relationships measured by a robot's sensors are relative relationships (for example, the distance between the robot and a landmark). When different frames of reference are used, uncertain spatial information must be propagated between these frames. Another topic of interest concerns how to utilize estimates of the uncertainty for the robot position and the entities in the map in order to accomplish a given task. For example, it helps to know the confidence level of the estimation in order to decide certain actions. As we will discuss later, loop closing techniques can use this confidence level to decide whether a hypothesis should be accepted or not. Uncertainty estimates can also be used to decide if the available information is sufficient to reduce the uncertainty to tolerable limits or if the mission can be completed without failing due to large uncertainty. There are many benefits from formulating this problem in a probabilistic way. The uncertainty can be expressed mathematically in the form of covariance matrix. This gives a useful mathematical representation for the problem, making it easier both to use the uncertainty estimates or to fuse multiple uncertainty estimates. In addition, a probabilistic formulation can also provide different options to the user instead of providing a single answer. For example, we can choose among multiple candidate estimates depending on the requirements of the task. 2.1.1 Preliminaries An uncertain spatial relationship can be represented by a probability distribution over a collection of spatial variables, which we typically represent by a probability density function. In the case of feature-based maps the variables of interests are X, the pose of the robot, M, the map and Z, the measurements. More specifically, XO:t indicates the pose of the robot from time 0 to t, Xo:t = {xo,Xl, '- ,Xt- 1,Xt }, where xi represents the pose of the robot at time i. M is the landmark position (x and y in two dimensions). Z= ZO:t {ZO, Zi,...,Zt -1, Zt}, is a list of measurements from the sensors. ZOt indicates measurements from time 0 to t where zi represents the sensor measurement at time i. The probability distribution p(X,MlZ) models these variables. We denote the estimated value of the robot poses as the expectation of random variable x (2.1) x= E(x) and define i to be the difference between the current value and the estimated value: (2.2) x = x -X The matrix C(x) is the covariance matrix of x, (2.3) CQx) = E (.jj) The covariance C(x) plays an important role in probabilistic robotics as it contains information not only about the individual state uncertainty but also about the correlation between states (which is of vital importance in SLAM). The position of the robot, x, is often assumed to have a normal distribution, in which case it can be written as 1X p(x)= 1 )C1( (2 )c exp[--2x -5DC-1(x-] It is sensible to assume a normal distribution for the variable x for two reasons. First, in Bayesian probability, the principle of maximum entropy states that, subject to known constraints (called testable information), the probability distribution which represents the best knowledge of the current state is the one with largest entropy. Given only the mean and the variance of a multivariate probability distribution, the distribution which assumes the least information is the normal distribution [42]. Second, if the spatial relationship is calculated by combining many different pieces of information, the central limit theorem can be used for reasoning that the resulting distribution will tend to be a normal distribution. Figure 2-1: Graphical model of the localization problem. 2.1.2 The localization problem The goal of the localization problem is to determine the pose of a robot relative to a reference point in the map. As this statement indicates, a reference point is assumed to be given and the map that it belongs to is also assumed to be given. The graphical model in Figure 2-1 is often used as a concise expression of the localization problem. Each circle represents a variable and a shaded circle signifies a known variable. The inference problem that we want to solve with this graphical model is the estimation of x given the control input u, the measurement z, and the map m. A robot uses sensors to infer its position from collective measurements z (using the measurement model, Equation 2.9) and control input u (using the motion model, Equation 2.8). For example, if the sensor is an inertial measurement unit (IMU), the robot integrates acceleration and angular velocity information over time to obtain a position estimate. If the sensor is a laser scanner, a technique such as scan matching can be used to estimate the robot's motion. Using a feature-based map representation, the localization problem is concerned with estimating the probability distribution P(XIM, Z, U) (2.4) of the history of robot poses, X, given the locations of the landmarks M, the measurements Z and the odometry measurements U [30]. Figure 2-2: Graphical model of the mapping problem. The localization problem has been widely studied; the reader is referred to Thrun [46] for a review of the basic techniques, including Monte Carlo localization and extended Kalman filter localization. 2.1.3 The mapping problem The goal of the mapping problem is to collect spatial information of surroundings and store the information in a suitable format. This assumes that accurate knowledge of the location of the vehicle is available at any time. Figure 2-2 shows a simple graphical model of the mapping problem for a feature-based map, where shaded circles represent known variables. The mapping problem in a probabilistic formulation concerns finding the distribution of the map M, given the history of the robot pose X, the measurement, Z, and odometry, U. P(MIZ,X, U) (2.5) The motivation for the mapping problem arises from the lack of availability of good models for environments of interest. Robot environments are usually highly dynamic, and so even if an a priorimap of an environment were available (via blueprints for a building or Google maps), such maps would not be sufficient for robot navigation because they would quickly become out of date as the world changes. Dynamic objects such as people walking around or furniture temporarily placed or newly built buildings make existing maps inconsistent with the environment a robot has to deal with. The ability to make and maintain maps of the environment is clearly vital for the safe and effective operation of a mobile robot. 2.1.4 Combining the localization and mapping problems: SLAM The localization and mapping problem is a "chicken-and-egg problem" [31]: given a map, we can localize the robot in the map; given the robot trajectory, it is possible to build a map. Simultaneous localization and mapping entails solving these two problems at the same time. In other words, it is an algorithm that allows a robot, placed at an unknown location in an unknown environment, to build a consistent map while estimating its location within this map. A probabilistic formulation of the SLAM problem consists of continuous and discrete parts for a feature based map. The position of a robot X and the map M are continuous variables, while correspondences of landmarks to generate the map are discrete variables. Solutions to the SLAM problem can be categorized as one of two types: online SLAM or full SLAM [46]. The main difference between full SLAM and online SLAM is that the full SLAM solution can update the entire trajectory of the robot, whereas online SLAM is limited to estimating the current or latest location of the robot without being able to correct estimates of previous poses. .... .... I ...... Online SLAM versus full SLAM Figure 2-3: Graphical model of the online SLAM problem. Each approach has advantages and disadvantages. Ideally, we would like to obtain a consistent solution for the entire trajectory at each time step as the robot operates. However, in practice, if the robot needs to perform a task in real-time, calculating a full posterior may require unreasonably large computational power, possibly causing delay. The solution to the full SLAM problem with unknown data association is the posterior distribution P(xi:t,M, C1:t|zl:t, Ui:,) (2.6) where xi:t is the state vector, m is the map, ci:, are the correspondences, zi:t are the mea- surements, and ui:t are the control inputs for all time. The corresponding graphical model is presented in Figure 2-4, where variables that will be estimated are framed by rectangles. The online SLAM solution, on the other hand, can be obtained by integrating all past poses and summing over all past correspondences to get the current pose and latest map. P(xt,,m,ctz1:tui:t) = .. -.. E P(x1:t, m,ci:tlzi:t,ui:t)dxi . . -dx,_ 1 (2.7) C1 Cl-1 There are number of different approaches used to solve full SLAM and online SLAM. An example of online SLAM is the extended Kalman filter SLAM; examples for full SLAM :---_-. .................................. , ::: . . .......... include smoothing and mapping (SAM) [22] and GraphSLAM [49]. Derivation of full SLAM with known data association Figure 2-4: Graphical model of the full SLAM problem. We will focus on the derivation of the mathematical expression for the full SLAM problem (assuming that data association is known) which we extend in this thesis to deal with multiple robots. We will factorize the probability distribution P(xi:t,mIzi :r, u1:t, C :t) into parts such that each part can be modeled using known knowledge of the system (e.g. motion models and sensor models). The difference from Equation 2.6 is that the correspondences cl:t or data association is assumed to be given. As to the factorization of this distribution, using valid assumptions about the nature of the problem can largely simplify the problem. This derivation is following Thrun [46]. We combine the state variable x and map m into yo:t = Yt = ( I- Bayes rule enables us to factorize P(yo:,lIz1:,, ui:t, c1:,) = IIP(z,|Iyo:,, z1:,-1, ui:t, c1:,)P(yo:tIz1:,--1, ui:t, c1:,) where 17 is the normalizer. After dropping irrelevant variables, the first probability on the right-hand side becomes P(zt lYO:,, zi:,-1, ui:,, ci:,) = P(z, |yr, c,) By partitioning yo:t into xt and yo:t- 1, the second probability on the right-hand side becomes P(YO:r Izi:,-1, ui:,, ci:t) = P(x,|YO:,--1, zi:,- 1,ui:,, c1:,)P(YO:,-i lzi:,-1, ui:,, c1:,) = P(x, Ix,_1, u,)P(YO:,-i lz1:,-1, ui:,_1, ci:,-i) After incorporating the latest measurement zt, this posterior becomes P(yo:,lzi:,,ui:t,,ci:t) = 17P(ztlyr, c,)P(x,|xt,_1ut)P(yo:,-ilzi:,-1,ul:,-1, ci:,-1) = 7 P(yo) B[P(xtlxt-1,ut)f P(ztiyr,ci)]. t i The two main parts which remained are called the motion model P(xt|xt-1,ut) and the sensor model P(zilyt, c.i) The motion model describes the transition of the robot's position given the previous position and the applied control inputs. Motion models are often non-linear and can employ different available information. A velocity model uses velocity information, whereas the odometry motion model uses odometry sensor measurements. The model is often written with the non-linear function g in the following manner: x = g(x)+w (2.8) where w ~ N(O, K2) is Gaussian white noise. Assuming a Gaussian distribution for the state variable x considerably facilitates problem solving. The distribution can be written as P(xt IxtI, ut) C< exp - g(_1,2-x| where we use the notation IIzI2= z'f-z. The sensor model extracts useful information from sensor returns. It can be also considered as a prediction step consistent with the concept of the Kalman filter. This model is also usually non-linear with added Gaussian white noise, v, and is described by z = h(x) +v (2.9) where v ~ N(O,E) and h is a non-linear function. If a Gaussian distribution is assumed, the distribution can be written as P(zt xt, ct) c< exp -||h(xt, ct) -Zt||) 2.2 Pose graph methods: smoothing and mapping Many different types of representations have been proposed for SLAM, and one of the most popular and successful state representations in recent years is the pose graph method. In pose graph SLAM, we represent the trajectory of the robot as a chain of poses, and we use a graphical model to represent the constraints conveyed by measurements including odometry and external measurements of the features of the environments. Pose graph methods use optimization techniques to provide the best answer given all the information. The pose graph method has advantages over filtering in many applications, while filtering has benefits that the pose graph method does not have. One of the interesting works on this matter is done by Strasdat et al. [16]. In this work, filtering and optimization approaches to monocular SLAM are compared and the trade off between accuracy and computation speed is analyzed via simulation and real images. Filtering marginalizes out all the poses other than the current one after every frame which keeps the graph relatively compact. If the robot's movements are repeated in a restricted area, the graph will not grow at all, since persistent feature variables are only added when new areas are explored. However, existing graphs will become more and more interconnected, since elimination of variables introduces connections between exiting nodes in the graph. New links between every pair of feature variables have to be stored and updated as more measurements are made. Optimization, on the other hand, executes batch updates every time it gets a new measurement, but keeps complexity manageable by selecting a set of key frames. Unselected poses and measurements connected to them are not marginalized out as in filtering, but simply discarded. Therefore it does not create full interconnection between nodes like filtering does. As such the optimization method stays sparse, and it is therefore relatively efficient even with a high number of features in the graph. In some cases, it is more sensible to marginalize the past information. In other cases, it is better to discard the information. Marginalization enables the past information to be propagated throughout time, whereas discarding the past information enables the batch optimization to be feasible. Discarding information might sounds like a bad idea, but marginalizing all information may propagate wrong or less-accurate information through the solution when all the information we need may already be in the selected key frames. Strasdat et al. [16] conclude that filter based SLAM frameworks might be more beneficial if only a small processing budget is available, but bundle adjustment optimization is generally superior elsewhere. 2.2.1 Smoothing methods for the SLAM problem Smoothing and mapping provides a popular solution for pose graphs. The definition of smoothing in estimation theory is the following from [14]. At time t, the error is defined as e(t) = s(t + a) - x(t) where s is the output of the system. The performance criterion is defined as the minimum mean-squared error. When a is positive, we call this a prediction problem; it tries to predict the signal variable a time units ahead of the present time t. When a is zero, it is the filtering problem, and when a is negative, it is the smoothing problem. In the smoothing case, the estimator is trying to estimate the signal value a time units in the past. In SLAM, it is used to update past states as more information becomes available. This corrects accumulated errors in order to keep the error e(t) small. If the error grows big and we cannot fix it, and the robot travels further and longer, the estimation is no longer meaningful. Smoothing has been widely used to create an approximating function that can capture the patterns of data, leaving out noise or other fine-scale structures and rapid phenomena. Smoothing and mapping (SAM) in next section and incremental smoothing and mapping (iSAM) in Section 2.3 are examples of smoothing. The reason we want to use smoothing rather than filtering is that it generates consistent solutioni throughout the given data, although recursive methods are preferred for online operations. Filtering is proven to be inconsistent for non-linear SLAM problems, diverging from the true solution as shown in [20] due to the linearization. On the other hand, smoothing also solves backwards in time, keeping the solution consistent. The first smoothing approach to the SLAM problem, as a network of constraints between poses, was suggested by Lu and Millios [32]. The first implementation of a smoothing method was performed by Gutmann and Nebel [15]. Their method is also known as "bundle adjustment" or "structure from motion" in the computer vision community. These methods are introduced in [50] and [17]. 2.2.2 Square root smoothing and mapping (square root SAM) Square root SAM is one of the popular methods that uses smoothing for the SLAM problem. In short, square root SAM applies sparse linear algebra and factorizes the information matrix to achieve effective calculations. To begin the discussion, we will formulate the SLAM problem as a least squares problem following the formulation from Dellaert [11]. We would like to estimate the most likely solution to the variable X given all measurements, Z. The prior P(X) is often assumed to ithe minimum least-squares estimate based on the obtained measurements. be uniform. In this case the maximum likelihood solution, X*, becomes X* = argmaxP(XIZ) (2.10) = argmin-logP(X,Z) (2.11) x x = arg min ( X |hk(X)- k k=1Z 1 )(2.12) (.2 where hk(X) is the measurement model for kth measurement from Equation 2.9. The next step is to linearize this model. Using a Taylor expansion, h(X) = h(XO)+H(X -Xo)+--- where H dh(X) dx .The linearized maximum likelihood problem becomes argmin(HX -(^-h(Xo)-HXO)) T X x = argmin(HX x - argmin(AX -b) x where A (2.13) c)TE-1 (HX T - (HX -(-h(XO)-HX)) c) (AX -b) E-2 H, and b = E-Ic. Now the SLAM problem can be written as a standard least-squares problem. 3* =argmin3 IA3 - b |2 (2.14) where 3 is a vector that we are solving for. In the linearized maximum likelihood problem above, 3 represents X. We are looking for the least squares solution that agrees most with the given data, the optimal solution 3*. For a full rank m x n matrix, with m > n, the optimal solution for this least squares problem can be found from the normal equations AT A3* =ATb (2.15) by setting the first derivative of IA5 - b 1| to be zero. Solving the normal equation might look straightforward at first sight: simply invert ATA to solve for 3*. However, with many measurements, the matrix A becomes very large, so it quickly becomes computationally very expensive to invert A TA matrix. One way of solving this without calculating the inverse is to use the Cholesky factorization of the information matrix E = AT A=RT R (2.16) where R is an upper triangular matrix, as suggested in [11]. Define y = R3* and solve Riy T=ATb (2.17) R * =Y. (2.18) for y. To find 6*, we then solve This is an efficient approach because R is an upper triangular matrix, and we can solve Equation 2.17 and Equation 2.18 by back-substitution. Alternatively, the QR decomposition of A can be used so that the information matrix does not have to be calculated explicitly. The QR decomposition makes A into the form R A =Q 0 where Q is an orthonormal matrix and R is an upper-triangular matrix. iSAM, which will be studied in detail in Section 2.3, is a method that uses this QR decomposition to achieve the incremental solution efficiently. There are advantages of SAM compared to the extended Kalman filter SLAM (EKF SLAM). One is that the information matrix stays sparse whereas both the EKF covariance matrix and information matrix become fully dense over time, as studied in [39] and [48]. This fact allows SAM to calculate the solution more efficiently, because the information matrix of SAM and the measurement Jacobian are compact representations of the map covariance structure. Efficient use of a square root R of the information matrix is another advantage of SAM, since R can be used immediately to obtain the optimal trajectory of the robot and the map. 2.2.3 Efficient large-scale smoothing and mapping using submaps Ni et al. [38] developed Tectonic SAM (TSAM), a large-scale mapping approach based on submap decomposition. The algorithm represents a global feature-based map by dividing it into submaps that have weak correlation with each other. The idea is to represent the correlation between submaps as a single link rather than dense links with very weak correlation. Tectonic SAM uses "base nodes" to represent the relative pose of each submap, providing a computational speedup as submaps can be efficiently separated from one another. Even though it was an algorithm for a single robot system, it is relevant to our work because it uses relative representations as we do in a different context. Tectonic SAM addresses batch, single robot, large-scale mapping using submaps, whereas our focus is on online multi-robot SLAM in a common global frame without separately optimizing submaps and a separator. 2.3 Incremental smoothing and mapping The incremental version of SAM is called incremental smoothing and mapping (iSAM), combining two desirable characteristics - smoothing and recursiveness. If SAM, discussed in the previous section, allows us to apply smoothing methods to obtain consistent solutions, iSAM allows us to obtain these consistent solutions in a real-time fashion. iSAM employs QR factorization to obtain the solution directly from the square root factor, denoted as matrix R. With Givens rotations for efficient updates and variable reordering for efficiency, iSAM provides a consistent solution for real-time operations. In this section, the details of iSAM are discussed. QR factorization A rectangular matrix A E R'"X" can be factorized into A = QR = Q Q1 Q2 = QIR1 (2.19) 0 0 using QR factorization, where R1 is a n x n upper triangular matrix, Qi is m x n, Q2 is m x (m - n), and both Qi and Q2 have orthogonal columns. If we require positive diagonal elements of R, and A has full rank n (where m > n), then R1 and Qi are unique. In this case, R1 is equal to the upper triangular factor of the Cholesky decomposition of ATA that is used in SAM. Advantages of using the QR method are that it is known to be numerically stable, and unlike Cholesky factorization, QR factorization does not need to compute the information matrix I[ from Equation 2.16. There are several methods for computing the matrix Q and R including Gram-Schmidt process, Householder transformations and Givens rotations. iSAM employs Givens rotation for efficient incremental update of R. Incremental solutions using Givens rotation After each measurement, one row representing the measurement is added to the square root matrix R. To update this information in the form of the upper triangular matrix R, iSAM [22] uses Givens rotation, which is one of the keys to efficient calculation of the solution. The Givens rotation matrix is defined as following: D where ' cos -sin# sin' cosJ (2.20) is chosen such that the element below the diagonal becomes 0. Thus performing a Givens rotation on R matrix cleans out one element below the diagonal at a time. After performing this for all non-zero elements below the diagonal, an upper triangular matrix is obtained. If we have a dense matrix R E Rnxn to update, in worst case, this operation has n2 to be executed a maximum of -- times. 2 However, in the SLAM problem, this step can be performed every time a measurement arrives. The new measurement corresponds to a new row in R. To make R into an upper triangular matrix, zeroing out one additional row at a time is enough, therefore at most n elements need to be zeroed out. In addition, R is usually sparse in SLAM, and the new measurement is often related to the latest variable (i.e. the variable at the very bottom of the state vector), therefore only the rightmost part of the matrix has non-zero entries. By combining QR factorization and a Givens rotation, the new R matrix can be obtained at each step without having to re-factorize the A matrix, and then iSAM uses this R to directly obtain the solution. Solving SLAM as a least-squares problem via QR factorization Equation 2.10 formulated the SLAM problem as a least-squares problem, and now we will look at how to obtain the optimal solution for this least-squares problem by applying QR factorization. The process follows [22]. If 3* is the least squares solution then 3* = argming||A5-b|| IQT A5 argmin|QI - QTb|| 2 argming||R5- d|2 + Ile1 where I e| | is the least-squares residual, and QTA where Q is an R 0 rQTb= d (2.21) e m by m orthogonal matrix, and R is the upper triangular matrix from QR factorization. The solution 3*can be obtained by back-substitution from R5* =d. (2.22) 1000 1500 2000 2500 0 500 1500 1000 nz = 19633 2000 2500 Figure 2-5: Sparse R matrix from [22]. iSAM uses the sparsity of the R matrix to compute the solution efficiently. R is used directly to obtain the estimated path of the robot. In fact, Q never has to be explicitly calculated. Instead d is calculated during the factorization as it is needed for the backsubstitution in Equation 2.22. The computational power required for back-substitution is largely dependent on the sparsity of R. For general dense matrices, back-substitution has quadratic time complexity. However, R is usually sparse in SLAM problems as shown in Figure 2-5. Taking advantage of this property and using the techniques presented in this section, iSAM produces consistent optimal solutions in a recursive fashion. Variable reordering In addition to the techniques used in iSAM discussed above, variable reordering is another key success factor of iSAM. Non-zero entries beyond the sparsity pattern of the information matrix are called fill-in. Having more fill-in increases computational complexity, especially when the matrix contains information on relationships between previous states. Variable reordering is a well-studied topic in sparse linear algebra to find an order of variables that reduces the fill-in. The order of variables is claimed to be a very crucial factor for computational efficiency according to [11]. Finding the optimal order is an NP hard problem, but heuristics exists such as COLAMD (column approximate minimum degree). The details of this algorithm are in Davis et al. [9]. This reordering technique dramatically improves the sparsity as shown in [11]. ............... .... ............ . ................. .......... .- -- - -. .............. .... .... ............................. ........ .... . ......... .... .... .. I1/> Figure 2-6: Mapping an indoor environment using iSAM. The robot trajectory is visualized in blue by plotting time as elevation of the robot trajectory. Red lines show loop closing constraints within the robot trajectory. iSAM example An example of results from iSAM is shown in Figure 2-6. The data were taken from an indoor environment by a mobile robot with a SICK laser scanner. The data, consisting of 11473 scans, were collected over a period of about 40 minutes. The total mapped area is about 100mx 55m, and the total length of the trajectory is about 278m. Limitations of iSAM While iSAM has numerous desirable properties, it is also important to know the limitations of the algorithm. First, as a common assumption for any optimization method, iSAM assumes that the initial estimate is sufficiently close to the global solution. A metric for how close is sufficient has not been quantified. One reason is that the geometry of the .............. ... ................. cost function is different for different structures of particular data sets, making it hard to standardize such metrics. Second, as for other SLAM methods, convergence to the global solution is not easy to show. Simulation data where we have ground truth shows that iSAM converges to the global solution. However, for real data, we can only check for convergence to a minimum not necessarily the global solution, and for example compare the resulting map with building blueprints. Alternatively ground truth can be gathered from survey grade GPS or other external references, but this can be very difficult or expensive to do. Third, the iSAM method developed by Kaess that is used in our work occasionally uses a batch update step to ensure that it converges to the global solution. However, fully incremental iSAM without batch steps has recently been developed in [21]. 2.4 Cooperative SLAM Much research has been done on multiple vehicle robotic systems as in [26]. In particular, the mapping problem has been investigated in various contexts. To mention a few, a stochastic estimation algorithm for collaborative localization is discussed in [41], and simple navigation with multiple robots was demonstrated in [6]. Combining mapping and exploration with multiple robots using behavior-based control is studied in [33], and combining topological maps collected by multiple robots is discussed in [10]. Among all interesting works on multi-robot systems, this section will focus on the multi-robot SLAM works that are relevant to our algorithm. 2.4.1 Cooperative robots for reducing odometry error Early work on multi-robot systems was motivated by the desire to reduce odometry error using multiple robots. Kurazume et al. [28] divided the robots into two groups. Only one group moves at a time, while the other group remains in position. Once in a while, all robots stop, perceive their relative position, and use this information to reduce errors in odometry. Further work by Rekleitis et al. [40] also had one robot move at a time, while the stationary robots tracked the position of the moving robot. It was shown in this work that the multi-robot system can achieve more accurate position estimates than with a single robot. 2.4.2 Cooperative SLAM using Kalman filter Williams et al. [51] use a Kalman filter approach with local submaps which were uncorrelated with the global map on each vehicle. The local submaps could then be transmitted to other vehicles and periodically fused into a global map. However, this approach is not robust to communication failure. The submaps are essentially an increment of information, and if one of these messages is corrupted or not received, the information it contains is not encapsulated in any future transmissions. Fenwick et al. [13] extend single robot EKF SLAM to multi-robot SLAM, and prove a convergence theorem for the first time. By this theorem, one can quantify the performance gains of collaboration for multi-robot systems, and determine the number of cooperating vehicles required to accomplish a task. 2.4.3 Cooperative SLAM using Monte Carlo methods Monte Carlo sampling methods are used in various multi-robot SLAM algorithms. Thrun et al. [45] combined maximum likelihood estimation with a posterior estimation technique for mapping cyclic environments with multiple robots. In this work the initial pose of the robots relative to each other is assumed to be unknown, posing an initialization problem. However, it is restrictive in that it requires one robot to be a "team leader" to initialize all other platforms, and these platforms have to build their own map only within the map of the team leader. Howard et al. [18] developed a particle filter based multi-robot SLAM algorithm with experimental results for cooperative mapping by four robots using laser sensing. The approach assumes that the first encounter between two robots is perfectly accurate, making it unnecessary to consider subsequent encounters. A key motivation for our work is to relax this assumption, allowing data to be shared for multiple, uncertain encounters, converging towards the optimal solution over time, sacrificing neither consistency nor computational efficiency. Unlike the completely coincidental encounters in [18], Konolige et al. [24] deployed active encountering coupled with exploration. A particle filter is used to verify the hypothesis of map matching. Each time, a robot decides if it is better to move to an unexplored area, or to move to a position where it can verify a hypothesis regarding the other robot's location. If a robot decides to verify a hypothesis, it sends a message to the other robot to stop, and moves to their estimated location. If the estimated location was approximately right so that two robots detect each other using laser finders, their maps are merged. If verification fails, the hypothesis is deleted. However, since the complexity of estimating map matches is exponential in the number of robots considered jointly, they only deploy 2 robots. An extension of this work is [25], where 66 robots, a relatively large number, are used. However, mapping is still performed with only two robots. They also consider intermittent communication, and their system is a distributed system. There are no assumptions about the robots' relative start locations. However, similar to [24], the "leader" robots have a complete and consistent map representing the data collected by all robots in the cluster, while team members only store and broadcast their own map and their observations. Occasionally, this map is broadcast to the other robots, in order to guarantee consistency. 2.4.4 Information form SLAM Thrun and Liu [47] formulated the multi-robot SLAM problem using the sparse extended information filter. This work highlighted the importance of the initialization problem determining the relative pose of one robot to another at the first encounter. Whereas most early work in multi-robot SLAM assumed that initial relative poses of the robots are known ([13] and [37]), Thrun and Liu handled initialization within the SEIF framework. 2.4.5 Smoothing and mapping approaches C-SAM by Andersson et al. [2] employs smoothing for a cooperative multi-robot system. The algorithm is essentially a multi-robot version of SAM. The C-SAM algorithm has some similarities to our approach, but also has several crucial differences. First, the focus of C-SAM is on the alignment and merging of multiple feature-based landmark maps [44], whereas our approach is based on pose graph representations. Second, C-SAM is a batch algorithm, whereas our algorithm is recursive. Further, our algorithm has been extensively tested with real data whereas published results for C-SAM only cover simulations with two robots. 2.5 Alternative approaches 2.5.1 FastSLAM Another notable approach using a filtering concept that is fully/partially recursive is FastSLAM. FastSLAM uses a modified particle filter to recursively estimate the posterior over robot paths, as it keeps multiple data association hypothesis. FastSLAM scales logarithmically with the number of landmarks and can solve either the full SLAM or online SLAM problem. FastSLAM essentially applies recursive Monte Carlo sampling for localization and mapping, but the derivation of the probabilistic formulation of the problem is not exactly simple. Without presenting too much mathematical detail, this section will focus on giving the essential elements of FastSLAM. FastSLAM decomposes the SLAM problem into a robot localization problem and a collection of landmark estimation problems that are conditioned on the robot pose estimates. The key idea of this factorization is that the locations of landmarks are independent given knowledge of the robot's path. Thus given the robot's path, K (the number of landmarks) Kalman filters are used to estimate landmark locations. The formulation becomes Z u',nt)nO(kjS't,tu'A') p(s', 0|zt, u' n') = p (Stz' (2.23) k where s stands for robot poses, 6 is set of all landmarks, u the robot controls, z the landmark measurements and n is the index of the landmarks perceived. The factorization is exact, and the first part of the factorization p(st iz, u', n) is computed using a modified particle . ............... Figure 2-7: FastSLAM implementation. The data set was taken at a parking lot, and light poles are used as features. Green star marks represent the true position of selected features, while red marks are particles estimating the locations of those features using a particle filter. The blue line is the trajectory of the robot. filter, which gives a good approximation of the posterior even under non-linear motion kinematics. In the second part of the factorization the p(Okls' ,z, u', ) are landmark pose estimates computed using Kalman filters. As mentioned earlier, FastSLAM uses separate filters for different landmarks and the location of landmarks depend on the path estimate. Similar to Monte Carlo localization (MCL), particle filter path estimation in FastSLAM maintains a set of particles representing the posterior. Each particle, denoted as s[], rep resents a hypothesis of the mth particle of the robot's path at time t. Then next particle set, denoted as St, is calculated incrementally from the set St-I which is used to generate a probabilistic distribution of the robot's pose at time t s[M] p(stluts m] )2.24) ( and is obtained by sampling from the probabilistic motion model. Each particle s[i] is drawn with a probability proportional to a so-called importance factor w) A naive implementation requires O(MK) running time (where M is the number of particles in the particle filter, and K is the number of landmarks). FastSLAM reduces the running time to O(MlogK) using a tree-based data structure, as described in [34]. Advantages of FastSLAM include the possibility of dealing with a large number of fea- ............... :: :: ::::::::::: ........... F T 2 ........................... :_: es 7 / Figure 2-8: Segmentation of laser scans. The data set was collected from a mobile robot, the MIT DARPA Grand Challenge car, with a SICK laser scanner. The SICK laser scanner returns a collection of laser points. We group laser points that are close together, and some of these groups become features. Different colors represent different groups. Features are manually labeled after grouping. tures, which can be useful to map large environments. Traditionally, the robot pose and the landmark locations are all estimated with a single Kalman filter with resulting covariance matrix size of (K+ 1) x (K+ 1). This makes the cost per iteration 0(K 2 ). FastSLAM, with a running time of O(MlogK) scales relatively well with the number of landmarks. Empirical results show that FastSLAM seems to scale up well with small particle sets (K > M). Another advantage of FastSLAM especially over EKF is that each particle can keep different data association decisions. FastSLAM approximates the full posterior, not just the maximum likelihood data association, which makes the algorithm somewhat more robust to wrong data association. However, this algorithm has limitations. The dimension of the state space constantly grows, but the number of particles is not adjusted accordingly. This leads to particle depletion, which means that there are no particles in the vicinity of the correct state. Therefore, FastSLAM will eventually fail to find the correct solution. Figure 2-7 shows an implementation of FastSLAM on a data set collected from a mobile robot, the MIT DARPA Grand Challenge car, with a SICK laser scanner. The data set was taken at a parking lot, and light poles are used as features. These features are manually labeled after grouping laser returns that are close together as shown in Figure 2-8. Green star marks represent the true position of selected features, while red marks are particles estimating the locations of those features from the particle filter. The blue line is the trajectory of the robot. 2.6 Scan matcher and loop closing techniques Constraints for the pose graph are often obtained from scan matching laser data. Scan matching is a method to find the relative position of two or more scans, either consecutive or not. Loop closing is the problem of finding the places that the robot has traveled before, for example using visual cues [8] or scan matching. 2.6.1 Scan matching There are many different ways to match scans, and the following is the method we adopted from [3]. First, we build a model of the environment. If we compare two scans, A and B, scan A is projected to a grid cell map, forming a model of the environment. Then scan B is projected onto this map. Candidate transformations within the search region are evaluated depending on how good the overlap between the two scans is. When a certain candidate transformation of scan B overlaps the most with the map based on scan A, the candidate is chosen to be the best solution and the corresponding transformation is published as the output of the scan matching algorithm. This technique is useful and important because we can get relative positions between certain time steps with good accuracy. This can be used either as a replacement for an odometry sensor, or, as will be discussed in next section, for loop closing to fix error accumulated over time. 2.6.2 Loop closing Loop closing is the problem of finding the relative position between different time steps when the robot returns to a previously visited place in order to correct accumulated error. The reason why loop closing techniques often dramatically improve the quality of the map is that they fix accumulated error. Estimation error accumulates because sensors integrate noisy measurements over time to get position estimates, but they also integrate the error. Unless the error is fixed by the loop closing technique, the error grows large fairly quickly. Although odometry sensors that have fairly small error exist, they are often very expensive. Even with these accurate sensors, as the robot travels larger distances, error will accumulate and become an important factor in influencing the quality of the map. Thus, loop closing is a very crucial step in robot navigation. There are many different methods of how to close the loop. Different criteria or different sensors can be used to recognize the places the robot revisits and when it does so. Cummins et al. [8] use visual cues with a technique called "bag of words". When laser sensors are available, scan matching can be used to provide the constraints. Depending on the result of the evaluation using these methods, a loop closure can be accepted or rejected. Evaluation for a loop closure using scan matching can, for example, be based on the ratio of laser points that can be successfully matched. If the loop closure is accepted, the information is added as additional information between two poses. With the pose graph optimization method, this is simply added as another constraint, equivalent to other constraints coming from an odometry sensor. The optimization process then finds a solution that agrees well with all constraints in the sense of minimizing the sum of squared error. 2.7 Summary This chapter has reviewed the previous literature related to the thesis. We have provided a brief overview of the SLAM problem statement and have described the iSAM technique which forms the basis for our new cooperative mapping algorithm, which is described in the next chapter. 4~ /4/ (a) Without loop closing (b) With loop closing Figure 2-9: Loop closing examples for a single robot. Accumulated error can be corrected by finding relative position between different visits of the same place. Chapter 3 Cooperative mapping with multiple relative pose graphs This chapter describes a novel method that facilitates multi-robot mapping based on multiple pose graphs. The goal is a scalable, general purpose technique that can be applied to multiple mobile robots and/or multiple missions executed in the same environment simultaneously or at different times. Our method can also be generalized to other types of SLAM methods, although results presented in this thesis use one of the the latest pose graph methods called iSAM [22] as a core engine. The main contribution of our work, anchor nodes, is discussed in this chapter followed by the details of our algorithm. 3.1 Anchor nodes One of the challenges in multi-robot mapping is merging maps from multiple robots, if the robot's initial relative position are unknown. Maps from different robots typically have different reference points, and those maps have to be merged into a global map with a shared reference point. Maps from different robots are merged into a global map when encounters take place. Encounters occur when the robots detect one another or have overlap between their maps. There are two types of encounters defined in this thesis. Direct encounters occur when one robot observes another robot, providing a constraint on the relative pose of the robots at a single time step. An indirect encounter occurs when two robots observe the same part of an environment - not necessarily at the same time, allowing a constraint to be estimated between the positions of two robots at the respective time steps. One of our contributions to the cooperative mapping problem is using "anchor nodes" to facilitate integrating inter-robot constraints obtained from encounters while providing consistent and recursive solutions. An anchor node represents the relative position of the map of a robot with respect to the global reference point. Each robot has its own map and its own anchor node. Using these anchor nodes, as soon as robots encounter one another, all the past data from each robot is available to other robots immediately in a common reference frame; there is no need to adjust each pose graph or to reprocess previous data that another robot has acquired. When the first encounter occurs, the pose graphs for the respective robots are not modified at all and only the anchor nodes change; when subsequent encounters occur, information can propagate between the two pose graphs similar to the single robot loop closing case. Using efficient updates with anchor nodes, real-time implementation is feasible. In addition, our approach does not assume synchronization and can be used in various applications such as heterogeneous cooperative mapping as shown in Figure 3-1. More details of the algorithm are provided in the next section. The use of anchor nodes allows us to go beyond work done by Howard [18] and Andersson [2] in several ways. Unlike [18], our approach deals with uncertainty in inter-robot constraints, and handles multiple encounters between robots. In addition, compared to [2], our approach is fully recursive and provides a globally consistenti solution for all robots in real time speed. These anchor nodes are conceptually similar to "base nodes" suggested in earlier work by Ni et al. [38] in the context of large-scale submaps and batch processing. A detailed comparison is given in the next section. 3.2 Pose graph formulation Our algorithm uses pose graphs for multi-robot cooperative mapping. The motivation for our work is to extend these pose graph SLAM techniques so that they may be applied in a Ithe minimum least-squares estimate based on the obtained measurements. . :........ ......... .................... I --P'-'.r.-_-_-..... ............ _ X__ _- -----------..... ............ ........... ..... Figure 3-1: One of implementations of our algorithm. (a) By using anchor nodes for efficient solution calculation, real-time implementation is feasible even with a heterogeneous set up. Details of the data set and map are presented in the next chapter. (b) We show an enlarged part of the map at the time of an encounter, overlaid on a floorplan. Data from the ground robot is shown in blue, for the quadrotor in red, and the encounters are shown in green. See Figure 4-9 for the full map and more details. 46 multi-robot setting, yet fast enough for real-time operation. Our method provides a recursive solution, while still providing a globally consistent solution for all robot trajectories and the relative parameters in real time. The details of our algorithm are presented in this section. 3.2.1 Multiple pose graphs: global versus relative We present a probabilistic formulation of the multi-robot SLAM problem based on pose graphs. Pose graphs are a common solution for single robot localization and mapping, in which all current and past robot poses form a Markov chain connected by odometry measurements, with additional constraints between arbitrary poses for loop closing events. Solving a pose graph formulation is also known as smoothing as studied in Section 2.2.1. In contrast to filtering, smoothing allows one to correctly deal with nonlinear measurements. Several algorithms for solving the pose graph in a recursive formulation have been presented. We choose to use incremental smoothing and mapping (iSAM) [22], because it provides an efficient solution without need for approximations, and allows access to the estimation uncertainties. We first extend iSAM to deal with multiple robot trajectories simultaneously in a global reference frame. We then present a relative formulation that is advantageous in terms of faster convergence speed, while still providing a globally consistent solution. Multiple pose graphs in a global frame We first formulate the multi-robot mapping problem in a single global coordinate frame using one pose graph for each robot trajectory. We will have a series of factor graphs as representations for intuitive description of each step. Figure 3-2a shows an example for two robots, with the Markov chains for each robot clearly visible. The pose variables are shown as blue shaded circles, and measurements as small black discs. For R robots, the trajectory of robot r E {0... R - 1} is given by Mr + 1 pose variables {x } d. As each trajectory by itself is under-constrained, we fix that gauge freedom by introducing a prior pr for each trajectory r. The prior can be chosen arbitrarily, but for simplicity is chosen to be the ................. . -a -------- .............. p0 0 0 U1 0 1 Robot 0 1 1 e Robot 1 * (a) pO 0 A 0 U1 0 1 @ * ~ U2 6 Robot 0 C1 10 ( C2 1 Robot 1 (b) Figure 3-2: From single robot pose graph to multiple relative pose graphs (continued in Figure 3-3). For visualization, we use the factor graph representation, which directly corresponds to the entries in the measurement Jacobian as described in [11]. Small black discs represent measurements (factors), and larger blue shaded circles represent variables. The lines indicate dependencies, where black lines are used for normal pose constraints and red lines for encounters. (a) Two pose graphs for two robots without encounters. Each trajectory is anchored by a prior pr on its first pose that can be chosen arbitrarily (typically chosen to be the origin). For simplicity, we omit the concept of loop closing constraints here. (b) Two encounters expressed as additional constraints connecting the pose graphs of the two robots. Note that encounters can be based on one robot observing the other (yielding a synchronized constraint such as ci), or more generally by a common observation of the environment detected for example by scan matching (yielding a constraint that connects poses created at arbitrary times). .. ......... ...... ... ....... . ............. . . .... . ...... ............. Robot 0 A 0 Robot I (a) Robot 0 12 C C3 Robot 1 C2 Robot 2 Time (b) Figure 3-3: From single robot pose graphs to multiple relative pose graphs (continued from Figure 3-2). (a) The same encounters as in Figure 3-2b, but using our relative formulation. We introduce anchors A' for each trajectory that specify the offset of the trajectory with respect to a common global frame. Each encounter measurement now additionally connects to the anchors of both trajectories. (b) Our relative pose graph formulation generalizes to more than two robots, and does not require the pose graphs to be synchronized or even to start at the same time or location. This example shows three encounters between three robots. origin. Measurements between poses of a single trajectory are of one of two types: The most frequent type of measurement connects successive poses, and is based on odometry measurements or scan matching. Another type of measurement connects two arbitrary poses, representing loop closing constraints. Although supported in our implementation, for simplicity we omit loop closing constraints from the presentation here and show only constraints u between successive poses x[_ I and x in the example in Figure 3-2a. In our discussion so far the robot trajectories are completely independent, which we now change by introducing the concept of encounters. An encounter c between two robots r and r' is a measurement that connects two robot poses x and xr in Figure 3-2b. An example is shown in the figure, with the dependencies between measurement and poses shown as red lines. Note that we do not make any assumptions about synchronization here. In particular, measurement ci in Figure 3-2b connects poses taken at the same time, which can for example arise from an observation of one robot by the other. But more generally, an observation can connect poses taken at different times and can, for example, be derived from scan matching of observations taken at arbitrary times. An example is the measurement c2 in Figure 3-2b. We take a probabilistic approach for estimating the actual robot trajectories based on all measurements. The joint probability of all pose variables X = {x }Rand priors Z {u}R-0 i R-1 P(X, Z, C)Oc and N encounters C= rrR-o H 7 {c}I Mr Mr M N i=1 j=1 lpr) HP(xrlxr 1,Ui)) Q P (P(x r=0 r0, measurements is given by xx i where the data association (ij, i', rj, r ) for the encounters is known. We assume Gaussian measurement models, as is standard in the SLAM literature. The process model fi(x_ 1 ,u) x w (3.1) describes the odometry sensor or scan-matching process, where w is a normally distributed zero-mean process noise with covariance matrix A . The Gaussian measurement equation r , r' x. = hj(xi, , c) + vj (3.2) models the encounters between robots, where vj is a normally distributed zero-mean measurement noise with covariance Fj. We solve the least-squares formulation that corresponds to the maximum a posteriori (MAP) estimate X* for the robot trajectories X X = arg maxP(X, Z, C) = arg min -log P(X, Z, C) x x (3.3) Combining Equation 3.3 with the measurement models Equation 3.1 and Equation 3.2, and choosing covariance E for the priors, leads to the following nonlinear least-squares problem: X arg min { ||pr - + Yx| r=0 = 2) N hj(x ,c) 1 l (3.4) - -' 11J where we use the notation |eI =7eTE-le for the squared Mahalanobis distance with covariance matrix E. We solve the non-linear least-squares problem (Equation 3.4) using the incremental smoothing and mapping (iSAM) algorithm [22]. If the process models and constraint functions are nonlinear and a good linearization point is not available, nonlinear optimization methods are used, such as the Gauss-Newton or the Levenberg-Marquardt algorithm, which solve a succession of linear approximations to Equation 3.4 to approach the minimum. Linearization of measurement equations and subsequent collection of all components in one large linear system yields a standard least-squares problem of the form 9* = argmin |A& - b|| 2 0 (3.5) where the vector 0 E Rn contains the poses of all robot trajectories, where n is the number of variables. The matrix A E Rinxn is a large, but sparse measurement Jacobian, with m the number of measurements, and b E R'n is the right-hand side vector. iSAM solves this equation by QR factorization of the measurement Jacobian A = Q R (3.6) 0 where R E Rnxn is the upper triangular square root information matrix (note that the information matrix is given by RTR = ATA) and QE Rinxm is an orthogonal matrix, that is not explicitly stored in practice. Instead, the right-hand side vector b is modified accordingly during the QR factorization to obtain d E Rn. The solution is then obtained by back-substitution (3.7) RO =d To avoid refactoring an increasingly large measurement Jacobian each time a new measurement arrives, iSAM modifies the existing factorization to include the new measurement rows. The key to maintaining efficiency is in keeping the square root information matrix sparse, which requires choosing a suitable variable ordering. iSAM periodically reorders the variables according to heuristic and performs a batch factorization that also includes a relinearization of the measurement equations. More details can be found in [22]. Multiple relative pose graphs Initialization is a major problem of the approach to multi-robot mapping described so far. If we do not have an encounter at the beginning of the sequence, such as cl in the example in Figure 3-2b, then we have a gauge freedom on the second trajectory. The gauge freedom also requires us to anchor the second robot trajectory using a prior. However, we do not have a good initial estimate for that prior before we obtain a first encounter. Therefore, any choice is likely to conflict with the first encounter once it is added. Removing the prior at that point is also problematic, as the variable estimates can be far from a good linearization point, with the likely consequence of getting stuck in a local minimum. To solve the initialization problem, we introduce the concept of an anchor. The anchor A' for robot trajectory r specifies the offset of the complete trajectory with respect to a global coordinate frame. That is, we keep the individual pose graphs in their own local frame, which is typically anchored at the origin with a prior, as shown in Figure 3-3a. Poses are transformed to the global frame by Ar ( x , where we use the notation e from Lu and Milios [32] for pose composition. When operating in the 2D plane, composing a pose b with a difference (odometry) d is defined as Xb + Xd COS 6b - yd sineb p=be d = yb+xd sineb+yd cosb tb + td I (3.8) and the reverse transformation e expresses p locally in the frame of b (odometry from b to p) (Xp -Xb) COS b - (yp -Yb) sin eb -(xp -xb) sinOb -±(p-yb)cosb d =peb I (3.9) Op - eb The formulation of encounter measurements in the relative formulation differ from the global frame, as they now involve the anchors. As shown in Figure 3-3a, the encounter now references the anchor nodes of both pose graphs. That makes sense, because the encounter is a global measure between the two trajectories, but the pose variables of each trajectory are specified in the robot's own local coordinate frame. The anchor nodes are used to transform the respective poses of each pose graph into the global frame, where a comparison with the measurement becomes possible. The measurement model h is accordingly modified to also include the two anchor variables: h' (x , c1 , Ar, An). The difference c between a pose from trajectory rand a pose xr from another trajectory r' is given by c= (A'e x[) e (A' e x). The respective term of the global formulation (Equation 3.4) is therefore replaced by 2 N S h'(x j=1 (3.10) , ArJ, Ar) - x Fj Our concept of relative pose graphs generalizes well to a larger number of robot trajectories. An example with three pose graphs is shown in Figure 3-3b. Note that the number of anchor nodes depends only on the number of robot trajectories. However, the overall system again faces a gauge freedom that is resolved by adding a prior to one of the anchor nodes. In practice, we only add anchor nodes when they are needed, and add a prior to the first such node. 3.2.2 Summary Using relative pose graphs and anchor nodes, our algorithm provides efficient and consistent solutions and scales well to the number of robots. It facilities integration of information from multiple robots in a real time fashion. Extensive test and evaluation of our algorithm are presented in next chapter. Chapter 4 Experimental results In this chapter, we illustrate the performance of our algorithm discussed in the previous chapter with a variety of data sets. These include a laser data set collected in a complex indoor environment and a laser data set acquired jointly by a helicopter and a ground robot, as well as a publicly available multi-robot laser data set provided by Howard [18]. An overview of the properties of the data sets is given in Table 4.1. For our experiments, we use a scan matcher (discussed in Section 2.6.1) that remembers a short history of scans corresponding to key frames, fits contours to the data, and for a new scan, finds a globally optimal alignment within a given search region. For our results, we show re-projection of the laser measurements instead of evidence grids, as evidence grids can sometimes hide inaccuracies of the map, such as a duplication of a wall. Performance evaluation for each data set is also provided in this chapter. 4.1 Stata Center data set We present results based on recorded data from the third floor of our building, the Stata Center. The building shape differs from traditional office buildings, yielding its own challenges for robot mapping. The data was recorded with a single robot, and is here split into two segments that are assumed to be created by independent robots. We omitted several scans around the cutting point to ensure the trajectories are independent. Encounters between robots (black lines in Figure 4-1, 4-3) as well as loop closing constraints (red lines in .. ......... ... ........................... -iiiiik - - - - -. iii ,, 0- -- -J L-~ Figure 4-1: Stata Center data set: sequences of map merging. (a) Maps before the first encounter shown. The alignment between the two maps is arbitrary as the initial relative position between the robots is unknown. Robots start at arbitrary initial position and have no knowledge about the other robot. (b)The first encounter is detected shown as black line. (c) The two maps are merged. - - -- .. ::::: . ........ ....... ............. ............ ........ - ...... .... - - -- .. .... .......... ---U-- - - - Number of robots Number of scans Number of poses Number of encounters 2 11473 881 43 3 77478 840 12 [Loading Dock 2 42320 386 7 Entries/columns in R 7.4 5.6 5.1 Mapped area Trajectory length Optimization time 100m x 55m 278m 2.3s 40m x 28m 398m 1.4s 45m x 60m 106m 0.3s I Stata Center I Fort AP Hill - - - - Table 4.1: Overview of the data sets and their properties. The original number of scans is reduced by the key frame process of the scan matcher to a smaller number of poses that eventually is used in the pose graph optimization. The optimization time includes obtaining a full solution after every constraint is added. Figure 4-2: Stata Center data set: intermediate map overlaid on a floor plan provided by MIT facilities. - ::: - ::: 1-111W ..................... >1 (b) Figure 4-3: Stata Center data set: the resulting map with over 11000 laser scans recorded in about 40 minutes, covering an area of approximately 100m x 55m. We split the data into two parts to simulate two robots traversing the same environment. (a) The merged map based on our algorithm. The blue part of the map is created by one robot, the green part by the other. (b) Encounters are visualized by plotting time as elevation of the robot trajectory. The vertical red lines show loop closing constraints within a single robot trajectory, while the vertical black lines show the encounters between the trajectories. Figure 4-3) within a trajectory are generated by scan matching. For the first encounter, we search the best matching scans among all scans up to the time. After that, an encounter is added if the robots are within 4m from each other, and more than 85% of the scans overlap. Figure 4-1 shows sequences of map merging starting from a map before the first encounter and right after the first merging. Figure 4-2 is the map overlaid on a floor plan provided by MIT facilities, showing the accuracy achieved. Figure 4-3 shows the resulting map created by our algorithm as well as the encounters. 4.2 Fort AP Hill data set We have applied our algorithm to a publicly available multi-robot data set recorded at Fort AP Hill, with the results shown in Figure 4-4, 4-5, 4-6. This data set was obtained from the Robotics Data Set Repository (Radish) [19]. Three robots travel through the same building on different routes, covering different parts of the building, but with significant overlap, as shown by the maps of the individual robots in Figure 4-5. The data set includes encounter information based on fiducial, however, since all the robots start at the same position of the map (bottom right) there are only few encounters later, and we used scan matching instead to define encounters. Encounters are defined based on the same criteria as for the previous data set. Figure 4-4 shows sequences of building a map. Each robot starts at arbitrary initial position without knowing the other robot locations until the first encounter. Our algorithm produced the combined map shown in Figure 4-6a. We show a 3D view of the planar map in Figure 4-6b with the encounters as vertical black lines between robot trajectories plotted by using time as elevation. 4.3 Helicopter and ground robot data set Results for cooperative mapping between a quadrotor helicopter and a ground robot are shown in Figure 4-8 and 4-9. This heterogeneous setup allows mapping of areas that are not fully accessible to either type of robot. While the ground robot can easily get into smaller .. --. MMIN ... ...................................... --.- . M. -..................... ... ... ... .. .- OP- ./ I2JL~j~ * IA. tT7 ~J.Ljj f N ~f]~\\ N --- =~-'-----2z - L: i~I (c) Figure 4-4: robot starts right robot. between all Fort AP Hill data set with three robots: sequences of map merging. (a) Each at arbitrary initial position. (b) After first encounter between the left and the Two of the three maps are aligned. (c) All maps are aligned after encounters three robots. 7..i (b) ii17 -. \-----~-~~ 1 Figure 4-5: Fort AP Hill data set with three robots: the individual maps from each of the three robots. 60 (a) (b) Figure 4-6: Fort AP Hill data set with three robots: the resulting map. (a) Map based on the output of our algorithm, using a different color for each of the three robots. (b) The encounters between the three robots are visualized by plotting time as elevation of the robot trajectory. The vertical black lines show encounters. ........ ............................. . ................ . ...... ....... Figure 4-7: Cooperative mapping using a quadrotor and a ground robot. Encounters are generated using a camera on the quadrotor to detect the checkerboard pattern on the ground robot and are refined by scan matching. 62 ....... ..... .......................................... ... $ I fl4~ Figure 4-8: Cooperative mapping using a quadrotor and a ground robot: individual maps. (a)-(b) The robots are shown in the left column. Individual maps obtained by each robot are shown on the right column. (c) An enlarged part of the map at the time of an encounter, overlaid on a floorplan. Data from the ground robot is shown in blue, for the quadrotor in red, and the encounters are shown in green. See Figure 4-9 for the full map and more details. .. .................. . ............... .......................... .. ...... ....... . ...... ... ... ....... . ............. . . (a) Figure 4-9: Cooperative mapping using a quadrotor and a ground robot: the resulting map. Encounters (green) are generated using a camera on the quadrotor to detect the checkerboard pattern on the ground robot and are refined by scan matching. (a) Merged map obtained from our algorithm, overlaid on a floor plan that was provided by MIT facilities. (b) Panoramic view of the loading dock environment. -...... .... - ............. .. E 60 + 50 - ___- global Z 4relative 30 S 20E 100 Z - __.MMOOMM.. ....... 0 Stata center Fort AP Hill Loading dock Figure 4-10: Comparison of convergence speed between relative and global parameterization, showing faster convergence for our relative formulation. spaces that are problematic for the helicopter, the helicopter can fly over obstacles and cross discontinuities in the floor level. The helicopter setup, control and 2D scan matching is described in Bachrach et al. [3]. We performed our experiment in the underground loading dock of the Stata Center that provides continuous hallways and spaces for the ground robot, and a large lower level that is accessible to only the helicopter. Encounters are found using vision to determine inter-robot observations as shown in Figure 4-7. A checkerboard pattern on the ground robot was detected from a camera on the helicopter. As the derived relative pose information is noisy, we used scan matching, both to refine the estimate, and to eliminate false positives. The individual maps generated from the ground robot and the helicopter are shown in Figure 4-8. The complete map shown in Figure 4-9a overlaid on a floor plan obtained from MIT facilities shows strong correlation between the learned map and ground truth. 4.4 Performance evaluation We compare convergence speed between the relative and the global parameterization in Figure 4-10. The results show that the relative formulation requires less iterations until the residual falls below a threshold (here 10-4). For iSAM we performed batch optimization every 100 steps, and in most cases a single iteration (reordering and relinearization) was sufficient for the relative formulation, while the global formulation required about twice as many iterations. The evolution of the uncertainty of one of the anchor nodes is shown in Figure 4-11. It ............. -- ......... . ............................................... .................. I'll ....................... .. ........ .......... . ..... 0.6 E 0.5 0.4 . 03 .!a > 1 1 2 1 1 Positional uncertainty (a + a ) - 0.2 01 0 100 200 300 400 500 Time step 600 700 800 900 Figure 4-11: Positional uncertainty of the origin of the second pose graph with respect to the origin of the first one for the Stata Center sequence. 0. 500 1000 1500- 2000 2500 0 500 1000 1500 nz = 19633 2000 2500 Figure 4-12: The square root information matrix at the end of the Stata Center sequence. The matrix remains sparse due to variable reordering, which is essential for an efficient solution. shows how the uncertainty of the origin of the second pose graph changes over time in the global reference frame. The global reference frame can be chosen arbitrarily, and in this case the origin of the first pose graph is chosen. Note that the anchor node is only created at the time of the first encounter (time step 160). The uncertainty remains constant until further encounters and loop closures within a single pose graph occur, for example at time step 231. Figure 4-12 shows the final square root information matrix of the Stata Center data set. The matrix is indeed sparse, with 19633 entries with side length 2646, yielding 7.4 entries per column. Experimental results and the evaluation in this chapter demonstrate the performance of . .... our algorithm and validate that the algorithm performs well with complex real world data under real time constraints. Chapter 5 Conclusion This thesis has presented a new approach to cooperative mapping that provides robust, real-time performance in complex environments. The algorithm was tested and validated on various real world data sets. Although the algorithm performed well, there are still areas that can be improved, especially related to existing challenges of the SLAM problem. Existing SLAM algorithms perform reasonably well in static indoor environments for short time durations. However, further research is required to enable operation in complex environments for long-term operations. In this chapter, we summarize our contributions, discuss challenges of current algorithms and suggest future directions. 5.1 Summary of contributions This thesis presents a new algorithm for cooperative multi-robot mapping, with the following properties. First, the algorithm solves the full multi-robot nonlinear SLAM optimization problem in real-time. Second, it provides fast convergence to the global solution, while preserving efficient access to covariances. Third, the algorithm accommodates multiple uncertain encounters between robots. Fourth, it preserves consistency, even in situations with cycles in the graph of encounters. The algorithm has been validated experimentally using several real data sets, demonstrating improved convergence speed in comparison to using a global parameterization. Our experimental results include indoor cooperative mapping by a ground robot and a he- licopter using visual encounters. Our relative formulation avoids the need to adjust individual pose graphs when they are combined. This is achieved using anchor nodes, which are mathematically equivalent to the "base nodes" first introduced by Ni et al. in the different context of large-scale batch submapping with the Tectonic SAM algorithm [38]. Anchor nodes make all the information from each robot immediately available as soon as a encounter takes place. Also, the information is integrated into a common reference frame, avoiding the need to adjust each pose graph or reprocess previous data from each robot. When the first encounter occurs, only the anchor nodes are updated, not entire pose graphs. As subsequent encounters occur, information can propagate between the two pose graphs similar to the loop closing case in a single robot system. However, there are limitations to our algorithm. Our approach does not take into account any issues of communication constraints between robots. In addition, the development of appropriate techniques for dealing with long-term changes in the environment is required for the algorithm to be beneficial for long-term persistent mapping, enabling the accumulation of large databases of relative pose graphs acquired over a robot's lifetime. Investigating incorporation of partial constraints for the robot-to-robot encounters, such as range only or bearing only measurements, is a potential extension of this work. These limitations are common to many existing SLAM algorithms. Such challenges and future directions are discussed in the next section. 5.2 Future work In this section, open challenges of the multi-robot SLAM problem are discussed, as well as applications of multi-robot mapping algorithm. Finally we discuss the integration of robot navigation algorithms so as to enable fully autonomous robots. 5.2.1 Open challenges of multi-robot SLAM Multi-robot SLAM problem has all the challenges of single robot SLAM while adding some of its own. These challenges must to be solved in order to have a full solution to the SLAM problem. Therefore they are discussed here and suggested as future research directions. Operating robustly in dynamic environments is one of them, and the system also has to maintain consistency, guarantee robustness and improve computational efficiency and complexity. The ability to navigate for a long duration (e.g. weeks or years) and large data set management are also important issues of the SLAM problem. In addition, reliable communication between robots and algorithm scalability are other challenges that only occur in multi-robot system. Dynamic environments Dealing with dynamic objects is difficult for robots because they are part of the map, are sensed by the robot and become part of the knowledge about the world. However, they are not consistently observed. When a robot comes back to where it traveled before and senses different objects, inconsistent information about the location is likely to cause confusion to the robot. In fact, this problem is not isolated to robots, changing environments are also challenging for humans. Suppose we have not visited a place for a long time and there has been large changes, it is hard to decide if we are at the same place or not. In spite of the inherent difficulties of the problem, solving this problem is crucial in order to have autonomous robots operating robustly in the real world full of dynamic objects. Previous work on dynamic environments includes [43] and [5]. Biber and Duckett [5] model temporal changes of local maps and maintain five different maps depending on different time scales. However, more extensive research is required for detection of the changes in a dynamic environment and integration of the new information with existing solutions. Robustness, computational efficiency, complexity and consistency It is reasonable to argue that a multi-robot system is relatively more robust to failures than a single robot system, but guaranteeing the robustness in various scenarios is challenging. Systemic failure recovery plans for different scenarios have to be carefully designed to guarantee the lower bound of the system's performance. Such plans may include the decision of the number of robots required to complete a mission or the modification of plans depending on the status of the system. Computational efficiency and low complexity is another challenge. iSAM provides efficient computation compared to other existing methods that provide consistent solutions. However, as multi-robot systems operate longer, and as more robots are added to the system, the required computational power increases while the resource is limited. Therefore, keeping efficient computation and low complexity is more important for multi-robot systems. Even if a solution is achieved using efficient computation, it is only meaningful if it is consistent. Keeping consistency often relates to considering all the data collected. This conflicts with improving computational efficiency and complexity. The trade off between these factors is challenging. Long-term mission and data structure management If we want autonomous robots that can operate with minimum or no human supervision, long-term operation challenges must be investigated. Inevitably the amount of collected data will grow as time increases, while the memory to store this data is limited. For longterm missions, it is often very hard to maintain an entire stream of data. One area for future work can be selecting the best information to store. This becomes a problem of deciding how to choose the most efficient and concise, yet compact representation of the world. There have been various types of spatial representations studied, but the optimal spatial representation for long-term operations has not been extensively studied. Not only does the algorithm have to have efficient spatial representations, but also such information has to be managed with efficient data structures. In other words, the data has to be efficiently indexed so that it can quickly be retrieved when needed. The benefit of efficient data structure would be significant in long-term operations, as we are dealing with large amount of data. Reliable communication Reliable communication is challenging by nature. We know from experience that there is often no guarantee of reliability of wireless or cell phone reception. Communication between robots in a real world application often cannot be guaranteed either. In fact, as losing communication may cause the failure of an entire system mission. Therefore, it is worthwhile to investigate multi-robot mapping with limited communication such as bandwidth limitation and intermittancy. Nerurkar et al. [36] investigated methods for managing communication cost and limiting complexity for large multi-robot teams performing cooperative localization using optimization techniques. However, this analysis was restricted to the case with limited computational requirements, where in real world, all different kinds of communication challenges exist. This topic also can be strongly related to the robustness of the multi-robot system. Scalability An additional challenge for multi-robot mapping is that the solution must scale well not only with the size of environments but also with the number of robots. A large number of robots was employed in [25] using a particle filter algorithm. However, the mapping is done by two robots and distributed to the others, as the number of map matching estimates increase exponentially to the number of robots. The ideal multi-robot algorithm should scale well to a large number of robots, even hundreds or more. 5.2.2 Future applications Multi-robot systems have various potential applications. They can range from automating jobs where currently humans are collaborating as a team to application that largely benefit from parallelism. Security, surveillance and target tracking Using multiple robots for security, surveillance and target tracking is another potential applications of the algorithm. To operate effectively these robots must localize and map their environment. Items of interest can be other mobile robots, items in a warehouse, people in search and rescue effort, or adversarial targets in surveillance and reconnaissance situations. There are also medical applications, such as moving cameras to keep the view .. .. .... .. ...... .. ..... ........... ........ (a) The rover platform ...... .......... (b) Collaboration of a forklift and a rover Figure 5-1: Heterogeneous multi-robot system collaboration. of the area of interest during an operation [29]. Collaboration among heterogeneous platforms A multi-robot system can consist of robots with different sensors, capabilities and purposes. Collaboration among these robots can be another application of multi-robot mapping algorithms. For example, if a system has "leader" robots with more accurate sensors and "follower" robots with less accurate sensors, the knowledge collected from "leader" robots can be used to leverage the knowledge of "follower" robots. As an example of heterogeneous systems, Agile robotics project at MIT is building an autonomous forklift (Figure 5-1b left) with multiple sensors and smaller robots (one of them is shown in Figure 5-la) with fewer sensors. The robot in Figure 5-la is called a rover. The role of the rovers is to work collaboratively together, as well as collaborating with the forklift as watchdogs. For example, rovers can provide extra information for navigation or help challenging maneuvers such as navigating a narrow corner or lifting an unstable pallet. .. .. _ .. . ..... .. . ..... .. .. ........ ............ .. I.. . Figure 5-2: Integration of robotics navigation algorithms is necessary for fully autonomous navigation. 5.2.3 Towards fully autonomous navigation This thesis focuses on a section of robot navigation, which is only a component of a complete autonomous robot system. To achieve the full functionality for autonomous navigation, integrating SLAM with path planning is necessary. Figure 5-2 depicts a simplification of active navigation research areas in robotics. For example, SLAM algorithms perform mapping and localization at the same time, and exploration algorithms perform mapping and path planning. Integrating separate areas of research is challenging, as fully understanding each of them is difficult. However, it is necessary to move towards fully autonomous navigation systems, indicated by the blue star located in the center of the figure. Bibliography [1] Website. http://www.paradigmhelicopters.com/you-fly-police-missions.php. [2] L.A.A. Andersson and J. Nygards. C-SAM : Multi-robot SLAM using square root information smoothing. In IEEE Intl. Conf on Robotics and Automation (ICRA), pages 2798-2805, 2008. [3] A. Bachrach, R. He, and N. Roy. Autonomous flight in unstructured and unknown indoor environments. In EuropeanMicro Aerial Vehicle Conference, Sep 2009. [4] T. Bailey and H. Durrant-Whyte. Simultaneous localization and mapping: Part II. IEEE Robotics and Automation Magazine, pages 108-117, June 2006. [5] P. Biber and T. Duckett. Dynamic maps for long-term operation of mobile service robots. In Proc. of Robotics: Science and Systems (RSS), 2005. [6] W. W. Cohen. Adaptive mapping and navigation by teams of simple robots. Robotics And Autonomous Systems, 18:411-434, 1996. [7] I. J. Cox and G. T. Wilfong. Autonomous Robot Vehicles. Springer-Verlag, 1990. [8] M. Cummins and P. Newman. Highly scalable appearance-only SLAM - FAB-MAP 2.0. In Robotics: Science and Systems (RSS), Seattle, USA, Jun 2009. [9] T. Davis, J. Gilbert, S. Larimore, and E. Ng. A column approximate minimum degree ordering algorithm. ACM Trans, 30:353-376, 2004. [10] G. Dedeoglu and G. Sukhatme. Landmark-based matching algorithm for cooperative mapping by autonomous robots. In DistributedAutonomous Robotics Systems, pages 251-260. Springer-Verlag, 2000. [11] F. Dellaert. Square Root SAM: Simultaneous location and mapping via square root information smoothing. In Robotics: Science and Systems (RSS), 2005. [12] H. Durrant-Whyte and T. Bailey. Simultaneous localization and mapping: Part I. IEEE Robotics andAutomation Magazine, pages 99-108, June 2006. [13] J. Fenwick, P. Newman, and J.J. Leonard. Cooperative concurrent mapping and localization. In IEEE Intl. Conf on Robotics and Automation (ICRA), volume 2, pages 1810-1817, 2002. [14] A. Gelb, editor. Applied Optimal Estimation. MIT Press, 1974. [15] J.S. Gutmann. Robuste Navigationautonomer mobiler Systeme. PhD thesis, University of Freiburg, 2000. [16] J.M.M. Montiel H. Strasdat and A.J. Davison. Real-time monocular SLAM: Why filter? In IEEE Intl. Conf on Robotics and Automation (ICRA), May 2010. [17] R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2000. [18] A. Howard. Multi-robot simultaneous localization and mapping using particle filters. Intl. J. of Robotics Research, 25(12):1243-1256, 2006. [19] A. Howard and N. Roy. The robotics data set repository (radish), 2003. [20] S. J. Julier. A sparse weight Kalman filter approach to simultaneous localisation and map building. In Sensor Fusion and Decentralized Control in Robotic Systems IV. SPIE, 2001. [21] M. Kaess, V. Ila, R. Roberts, and F. Dellaert. The Bayes tree: Enabling incremental reordering and fluid relinearization for online mapping. Technical Report MITCSAIL-TR-2010-021, Computer Science and Artificial Intelligence Laboratory, MIT, Jan 2010. [22] M. Kaess, A. Ranganathan, and F. Dellaert. iSAM: Incremental smoothing and mapping. IEEE Trans. Robotics, 24(6):1365-1378, Dec 2008. [23] B. Kim, M. Kaess, L. Fletcher, J. Leonard, A. Bachrach, N. Roy, and S. Teller. Multiple relative pose graphs for robust cooperative mapping. In IEEE Intl. Conf on Robotics and Automation (ICRA), Anchorage, Alaska, May 2010. [24] J. Ko, B. Stewart, D. Fox, K. Konolige, and B. Limketkai. A practical, decisiontheoretic approach to multi-robot mapping and exploration. In IEEE/RSJ Intl. Conf on Intelligent Robots and Systems (IROS), pages 3232-3238, 2003. [25] K. Konolige, C. Ortiz, R. Vincent, A. Agno, B. Limketkai, M. Lewis, L. Briesemeister, D. Fox, J. Ko, B. Stewart, and L. Guibas. Centibots: Large scale robot teams. In In AAMAS, 2003. [26] D. Kortenkamp, R. P. Bonasso, and R. Murphy, editors. Artificial Intelligence and Mobile Robots. AAAI Press, 1998. [27] S. Kullback. Information theory and statistics. John Wiley and Sons, NY, 1959. [28] R. Kurazume, S. Nagata, and S. Hirose. Cooperative positioning with multiple robots. In In Proc. of the IEEE/RSJ InternationalConference on IntelligentRobots and Systems (IROS), pages 1250-1257, 1994. [29] S. Lavalle, C. Becker, and J. Latombe. Motion strategies for maintaining visibility of a moving target. In In Proc. of the IEEE InternationalConference on Robotics and Automation (ICRA), pages 731-736, 1997. [30] J. J. Leonard and H. F. Durrant-Whyte. Mobile robot localization by tracking geometric beacons. IEEE Trans. Robotics and Automation, 7(3):376-382, June 1991. [31] J. J. Leonard and H. F. Durrant-Whyte. Simultaneous map building and localization for an autonomous mobile robot. In Proc. IEEE Int. Workshop on Intelligent Robots and Systems, pages 1442-1447, Osaka, Japan, 1991. [32] F. Lu and E. Milios. Globally consistent range scan alignment for environment mapping. Autonomous Robots, pages 333-349, Apr 1997. [33] M. J. Mataric. Coordination and learning in multirobot systems. IEEE Intelligent Systems, 13(2):6-8, 1998. [34] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit. FastSLAM: A factored solution to the simultaneous localization and mapping problem. In Proceedings of the AAAI National Conference on Artificial Intelligence, Edmonton, Canada, 2002. AAAI. [35] A. I. Mourikis and S. I. Roumeliotis. Predicting the accuracy of Cooperative Simultaneous Localization and Mapping. InternationalJournalof Robotics Research, 25(12):1273-1286, December 2006. [36] E. D. Nerurkar, S. I. Roumeliotis, and A. Martinelli. Distributed maximum a posteriori estimation for multi-robot cooperative localization. In IEEE Intl. Conf on Robotics and Automation (ICRA), pages 1402-1409, May 2009. [37] E. W. Nettleton, H. F. Durrant-Whyte, P. W. Gibbens, and A. H. Goektogan. Multipleplatform localization and map building. In Proc. SPIE, Sensor Fusion and Decentralized Control in Robotic Systems III, volume 4196, pages 337-347, 2000. [38] K. Ni, D. Steedly, and F. Dellaert. Tectonic SAM: Exact, out-of-core, submap-based SLAM. In IEEE Intl. Conf on Robotics and Automation (ICRA), pages 1678-1685, Apr 2007. [39] M. Paskin. Thin junction tree filters for simultaneous localization and mapping. In Proceedings of the 18th InternationalJoint Conference on Artificial Intelligence, pages 1157-1164, San Francisco, CA, 2003. [40] I. M. Rekleitis, G. Dudek, and E. E. Milios. Multi-robot exploration of an unknown environment, efficiently reducing the odometry error. InternationalJoint Conference on Artificial Intelligence, 15:1340-1345, 1997. [41] S.I. Roumeliotis and G.A. Bekey. Distributed multi-robot localization. IEEE Transactions on Robotics and Automation, 18(5):781-795, October 2002. [42] R. Smith, M. Self, and P. Cheeseman. Estimating Uncertain Spatial Relationshipsin Robotics. Autonomous Robot Vehicles. Springer-Verlag, 1990. [43] C. Stachniss and W. Burgard. Mobile robot mapping and localization in non-static environments. In In Proc. of AAAI, pages 1324-1329. AAAI, 2005. [44] J.D. Tard6s, J. Neira, P.M. Newman, and J.J. Leonard. Robust mapping and localization in indoor environments using sonar data. Intl. J. ofRobotics Research, 21(4):3 11330, Apr 2002. [45] S. Thrun, W. Burgard, and D. Fox. A real-time algorithm for mobile robot mapping with applications to multi-robot and 3d mapping. pages 321-328, 2000. [46] S. Thrun, W. Burgard, and D. Fox. ProbabilisticRobotics. MIT Press, 2005. [47] S. Thrun and Y Liu. Multi-robot SLAM with sparse extended information filters. Proc. of the Intl. Symp. of Robotics Research (ISRR), 15:254-266, 2005. [48] S. Thrun, Y Liu, D. Koller, A.Y Ng, Z. Ghahramani, and H. Durrant-Whyte. Simultaneous localization and mapping with sparse extended information filters. International Journalof Robotics Research, 23(7-8):693-716, July-August 2004. [49] S. Thrun and M. Montemerlo. The graph SLAM algorithm with applications to largescale mapping of urban structures. InternationalJournal of Robotics Research, 25(56):403-429, 2006. [50] B. Triggs, P.F. McLauchlan, R.I. Hartley, and A.W. Fitzgibbon. Bundle adjustment - A Modem Synthesis. In Proceedings of the International Workshop on Vision Algorithms: Theory and Practice,page 372. Springer-Verlag, 1999. [51] S. Williams. Efficient solutions to autonomous navigation and mapping problems. PhD thesis, University of Sydney, 2001.