© 2005 Institute for Scientific Computing and Information
INTERNATIONAL JOURNAL OF INFORMATION AND SYSTEMS SCIENCES
Volume 1, Number 3-4, Pages 364–371

A METHOD OF MULTI-ROBOT FORMATION WITH THE LEAST TOTAL COST

MEIPING SONG, GUOCHANG GU, RUBO ZHANG, AND XINGCE WANG

Abstract. A formation method for the “pursuit-evasion” game is presented in this paper. It keeps the total travel cost of the team as low as possible while reducing the computational complexity of role assignment. The method exploits the adaptability and cooperation of a multi-robot system and introduces the roles of real leader and virtual leader. It works with any type of formation, and especially well with complex ones. It avoids the unnecessary total cost that a fixed role formation incurs when the whole formation rotates, and it limits the growth in role-assignment complexity caused by the number of robots. The efficiency of the method is illustrated from the viewpoints of geometry, complexity and simulation.

Key Words. formation, multi-robot, role assignment, the least total cost.

1. Introduction

Much current robotics research focuses on multi-robot systems, and the most commonly used test platforms are formation, pursuit-evasion and the Robot World Cup. All of these require the robots to cooperate and form a particular shape in order to accomplish a task. For some cooperative tasks, such as maritime search and rescue or searching for and driving out an invader in a given environment, the efficiency of the formation directly influences the performance of the system. A good formation method is therefore critically important for the efficiency of multi-robot cooperation.

Several formation methods exist, such as time-optimal [1], formation vector [2] and adaptive team formation [3]. These algorithms are mainly concerned with time and with the adaptability of the formation, and pay little attention to the total travel cost of the robots. Yet energy supply remains an underlying problem whenever robots perform a task.
A method that requires the least cost is therefore necessary. On the other hand, when multiple robots cooperate online in a dynamic environment, real-time performance is important. A good formation method should solve the problems that arise in formation without sacrificing real-time performance. This paper therefore proposes a distributed optimal formation method that improves the adaptability of the formation without reducing the real-time performance of the system, and at the same time greatly decreases the total travel cost of the robots. The efficiency of the method is illustrated from the viewpoints of geometry, complexity and simulation.

The paper is organized as follows. First, the shortcomings of fixed role formation in the “pursuit-evasion” game are detailed. Second, a new method with the least total cost is proposed and its efficiency is illustrated. Finally, the simulation results and conclusions are presented.

Received by the editors June 1, 2004 and, in revised form, January 22, 2005.

2. Motivation

In “pursuit-evasion” games, robot behavior is commonly constructed on either a graph [4][5] or a grid [6][7]. In the graph case, the environment is known in advance and the games are mostly based on visibility, that is, the task is achieved once the invader is detected [8][9]. In this case the robots do not need to form a special shape to catch the evader. In the grid case, however, the environment is often unknown, and the pursuing robots must occupy most or even all of the cells around the evader to prevent it from moving. The state of “success” in the grid game is shown in Figure 1, where the black squares represent pursuers and the black circle represents the evader. Every robot has five available motions: stop, left, right, up and down. In the state shown in Figure 1, the evader can no longer move, and the task is achieved.

Figure 1. The state of success in grid form

In grid games, the pursuers must cooperate to reach a formation shape eventually. According to the task model and the state of the evaders, the pursuers are partitioned into several groups, which correspond to particular roles. This is the subject of task decomposition and role assignment. Fixed role assignment requires less computation and can guarantee the success and stability of the formation, but it also incurs unnecessary total cost. The problem is most obvious when tracing a target. There are four roles in the task of Figure 2: head, left, right and tail. In the rhombic formation of Figure 2(a), robot R1 is the fixed head, R2 the left, R3 the right, and R4 the tail. When the path turns a corner as in Figure 2(b), the formation has to change into that of Figure 2(c), which costs more in total than the situation in Figure 2(d). The cause of this problem is that the roles are fixed throughout the process, so the cooperation available in a multi-robot system is not fully exploited. If the head role is transferred to R3 at the corner of the path, and the other roles are passed clockwise to the corresponding robots, the ideal situation shown in Figure 2(d) is achieved.

Figure 2. Analysis of fixed role formation and dynamic role assignment

The question is then how to realize role transition among robots in a distributed system. The direct method is to match all the roles with all the robots in every possible way and then select the best matching according to some predefined criterion. But the cost of this method grows explosively with the number of robots. The method proposed here avoids this combinatorial growth, as discussed in the next section.

3. Optimal policy of formation with the least cost

3.1. Formulation.
To construct continuous behavior for the robots and to allow the evader to be driven, an extended version of the grid “pursuit-evasion” game is used. The original description of the environment and task model can be found in [10]. Here, the state of “success” is defined as in Figure 3, and the unknown environment may contain polygonal obstacles, as in Figure 5. Figure 3 shows that the pursuers should form a regular polygon inscribed in the predefined catching circle centered on the evader. Ri denotes the ith pursuer, i = 1, ..., n, where n is the number of required pursuers, RT is the evader, and r is the radius of the catching circle. The radius is determined by the safe distance d_f between robots: a regular n-sided polygon inscribed in a circle of radius r has side length 2r sin(π/n), and setting this side length equal to d_f gives

(1)  $r = \dfrac{d_f}{2\sin(\pi/n)}$.

Figure 3. The state of success in extended version

This makes the formulation flexible for problems of various scales. When the number of required pursuers changes, the radius of the catching circle changes correspondingly, which does not affect the role assignment method in this paper. This settles the problem of task decomposition in formation; we turn next to the problem of role assignment.

3.2. Algorithm description. The primary idea of the role assignment technique in this paper is that the robot whose policy causes the least total cost acts as the real leader in each cycle. The others act as assistants, which fetch the sub-tasks assigned by the real leader from the blackboard system and reset their targets to shape a new pattern. The process is as follows. First, each robot acts as a virtual leader and independently makes its own local decision about the current situation. From the viewpoint of the robot that makes it, each decision gives the least cost for both that robot and the total travel distance.
Second, all the locally optimal policies are sent to the blackboard, where their total costs are compared and the globally optimal one is selected. Finally, the maker of the winning policy acts as the real leader and the others act as its assistants for this cycle. All the robots receive their new roles and continue the task.

The instance in Figure 4 illustrates the algorithm proposed here. There are 6 pursuers in the “pursuit-evasion” task, and the expected shape is a regular hexagon centered on the evader. Figure 4 shows the locally optimal decision of virtual leader R2. RT is the evader, and Ri is the ith pursuer, i = 1, ..., 6. T0, marked by a black triangle, is the point on the boundary of the catching circle nearest to R2. Being selfish, R2 prefers to take T0 as its own sub-target and then decomposes the task locally on the basis of T0. Tj is the jth sub-target determined by R2 from T0, j = 1, ..., 5, according to formula (2):

(2)  $\angle T_{j-1} R_T T_j = 2\pi/n$.

Figure 4. The local optimal decision of virtual leader R2

The decision process of R2 in the distributed system is as follows.

Step 1: R2 determines its sub-target T0, marked by the black triangle, at the intersection of the line joining it to RT with the catching circle centered on RT. The other five sub-targets are located from T0 and RT (or any other reference point of the formation) so as to form a regular hexagon centered on RT. The resulting ordered set of the other sub-targets is T = {T1, T2, T3, T4, T5}.

Step 2: Build a polar coordinate system with pole RT and polar axis $\overrightarrow{R_T R_2}$. Compute the polar angles θi of the other pursuers in this coordinate system, i = 1, 3, 4, 5, 6.

Step 3: Arrange the other pursuers in ascending order of their polar angles θi, using any sorting method such as Bubble sort. The final ordered set in Figure 4 is R = {R6, R4, R1, R3, R5}.

Step 4: Match the corresponding elements of the ordered sub-target set T and the ordered pursuer set R in turn. This completes the locally optimal decision of R2, in which each pursuer corresponds to one sub-target.

Step 5: Put this locally optimal decision on the blackboard system, select the globally optimal decision, and determine the real leader according to the criterion of the least total cost.

Step 6: According to the real leader's policy, each robot fetches its new role and sub-target from the blackboard system, modifies the position of the sub-target if necessary (detailed next), and moves into the new pattern. The transfer of leadership is then complete.

Note that, because there are obstacles in the environment, some of the original sub-targets may fall inside the area occupied by an obstacle. If the sub-target of a pursuer lies in an obstacle, the pursuer withdraws the sub-target along the radius vector shown in Figure 5 until the sub-target is safe, so that the robot avoids collision with the obstacle. When obstacles are so close together that the safe distance cannot be satisfied, the robots proceed in a line formation. In the figure, the dark polygons are static obstacles, and $T_i'$ is the real sub-target after modification, i = 2, 3, 5, 6. This technique makes the shape of the formation adapt to the environment, especially when the obstacles are dense.

3.3. Brief proof. We now briefly show, from the viewpoint of geometry, that the combination of the dotted lines R1 → T3 and R3 → T4 in Figure 4 is better than the combination R3 → T3 and R1 → T4 under the principle of least total cost. These two combinations represent different types of role assignment in general, and the former is the one produced by the method in this paper. First, the former avoids crossing lines, which implicitly avoids a path conflict between R1 and R3.
Resolving such a conflict in a multi-robot system requires not only the computation needed for replanning but also communication between the robots involved. This degrades the real-time performance of the system, especially where communication is difficult, such as underwater.

Figure 5. Modification of target when in the obstacle

Second, let O denote the point where the segments R1T4 and R3T3 cross, as equations (5) and (6) imply. The following four relationships then hold for the triangles $\triangle R_1 O T_3$ and $\triangle R_3 O T_4$:

(3)  $\|R_1 O\| + \|O T_3\| > \|R_1 T_3\|$.
(4)  $\|R_3 O\| + \|O T_4\| > \|R_3 T_4\|$.
(5)  $\|R_1 O\| + \|O T_4\| = \|R_1 T_4\|$.
(6)  $\|R_3 O\| + \|O T_3\| = \|R_3 T_3\|$.

Adding (3) and (4) and regrouping the left-hand side by means of (5) and (6) gives

(7)  $\|R_1 T_4\| + \|R_3 T_3\| > \|R_1 T_3\| + \|R_3 T_4\|$.

By analogy, the current decision can be shown to be the locally optimal one. In addition, Step 5 of the algorithm avoids settling for a local optimum by taking advantage of cooperation between the robots, so the final decision would be the globally optimal one.

3.4. Analysis of complexity. This algorithm introduces the roles of real leader and virtual leaders. Given n robots, the most costly work of each virtual leader consists of two parts. The first is to make its locally optimal decision, which amounts to sorting the (n−1) polar angles and matching the corresponding pursuers to the sub-targets. The second is to find the globally optimal policy among the n policies of all the virtual leaders. The time complexity of this process is $O(n^2)$ when Bubble sort is used.

Without virtual leaders and a real leader, a robot would first have to combine the n robots with the n sub-targets in all n! possible patterns, establishing n! policies, and then select the globally optimal one among them. The time complexity of this process is $O(n!)$.
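The procedure of Section 3.2 and the exhaustive alternative just described can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`catching_radius`, `local_policy`, `select_real_leader`, `exhaustive_cost`), the use of Euclidean distance as the travel cost, the counter-clockwise numbering of sub-targets, and the sample coordinates are all assumptions introduced here. The blackboard is reduced to a plain list, the obstacle handling of Step 6 is omitted, and Python's built-in sort stands in for Bubble sort.

```python
import math
from itertools import permutations

def catching_radius(d_f, n):
    """Formula (1): radius of the catching circle for n pursuers kept d_f apart."""
    return d_f / (2.0 * math.sin(math.pi / n))

def bearing(origin, point):
    """Angle of the ray origin -> point, measured from the x-axis."""
    return math.atan2(point[1] - origin[1], point[0] - origin[0])

def local_policy(leader, pursuers, evader, r):
    """Steps 1-4 for one virtual leader; returns (total_cost, {pursuer index: sub-target})."""
    n = len(pursuers)
    base = bearing(evader, pursuers[leader])
    # Step 1 and formula (2): T0 lies on the ray evader -> leader, and each
    # following sub-target is rotated by a further 2*pi/n about the evader.
    targets = [(evader[0] + r * math.cos(base + 2 * math.pi * j / n),
                evader[1] + r * math.sin(base + 2 * math.pi * j / n))
               for j in range(n)]

    # Step 2: polar angle of another pursuer, relative to the axis evader -> leader.
    def polar_angle(i):
        return (bearing(evader, pursuers[i]) - base) % (2 * math.pi)

    # Step 3: order the other pursuers by ascending polar angle.
    others = sorted((i for i in range(n) if i != leader), key=polar_angle)
    # Step 4: the leader takes T0; the sorted pursuers take T1, T2, ... in turn.
    assignment = {leader: targets[0]}
    for j, i in enumerate(others, start=1):
        assignment[i] = targets[j]
    cost = sum(math.dist(pursuers[i], assignment[i]) for i in range(n))
    return cost, assignment

def select_real_leader(pursuers, evader, d_f):
    """Step 5: compare the n local policies (the 'blackboard') and keep the cheapest."""
    n = len(pursuers)
    r = catching_radius(d_f, n)
    policies = [local_policy(k, pursuers, evader, r) for k in range(n)]
    leader = min(range(n), key=lambda k: policies[k][0])
    cost, assignment = policies[leader]
    return leader, cost, assignment

def exhaustive_cost(pursuers, targets):
    """Baseline without leaders: try all n! matchings of pursuers to sub-targets."""
    return min(sum(math.dist(p, t) for p, t in zip(pursuers, perm))
               for perm in permutations(targets))

# Hypothetical scenario: six pursuers around an evader at the origin.
pursuers = [(3.0, 0.5), (1.0, 2.5), (-2.0, 2.0), (-3.0, -1.0), (0.5, -3.0), (2.5, -2.0)]
evader = (0.0, 0.0)
leader, cost, assignment = select_real_leader(pursuers, evader, d_f=1.0)
baseline = exhaustive_cost(pursuers, list(assignment.values()))
print(f"real leader: R{leader + 1}, total cost: {cost:.3f}, exhaustive baseline: {baseline:.3f}")
```

For n pursuers the sketch evaluates n local policies, each dominated by a sort of n−1 angles, whereas `exhaustive_cost` enumerates n! matchings for a single set of sub-targets and becomes slow quickly as n grows. Printing both costs side by side lets the reader check, for a given scenario, how the selected policy compares with the best matching on the same sub-targets.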
Compared with this exhaustive $O(n!)$ matching, the $O(n^2)$ algorithm proposed in this paper reduces the time complexity greatly.

4. Simulation and conclusions

The feasibility of this algorithm can be tested easily in a simulation system. In the situation shown in Figure 2, the robots transfer leadership and arrive at the shape shown in Figure 2(d). When the obstacles are dense, the robots modify the shape of the formation to pass through them safely, as shown in Figure 6.

Figure 6. The simulating result

It can be concluded, first, that this method avoids the unnecessary total cost caused by rotating the whole formation, because it changes roles flexibly according to the dynamic reference point and the environment. Second, it greatly reduces the computation required in contrast to the case without a leader. Third, it overcomes the combinatorial explosion caused by an increasing number of robots, and it is more robust than the matrix selection method [1] because it avoids the unexpectedly poor final result that method can produce. The method extends easily to complex formation shapes, but its advantage is less apparent for linear formations.

References

[1] Dong, S.L., Xi, Y.G. and Chen, W.D., Dynamic Optimization for Multi-robot Uncertain Cooperative Mission, Robot, vol. 24, no. 1, pp. 31-34, 2002.
[2] Bo, X.Z. and Hong, B.R., A Behavior of Cooperative Formation for Multi-agent Robotic System, Journal of Astronautics, vol. 22, no. 3, pp. 38-44, 2001.
[3] Wang, X.C., Zhang, R.B. and Guo, G.C., Research on Multi-agent Team Formation Based on Reinforcement Learning, Computer Engineering, vol. 28, no. 6, pp. 15-16, 2002.
[4] Suzuki, I. and Yamashita, M., Searching for a Mobile Intruder in a Polygonal Region, SIAM J. Computing, vol. 21, pp. 863-888, 1992.
[5] Isler, V., Kannan, S. and Khanna, S., Locating and Capturing an Evader in a Polygonal Environment, Workshop on Algorithmic Foundations of Robotics (WAFR’04), 2004.
[6] Wang, Y.H. and Hong, B.R., Cooperative Multiple Mobile Targets Capturing Algorithm for Robot Troops, Journal of Xi’an Jiaotong University, vol. 37, no. 5, pp. 573-576, 2003.
[7] Zhou, P.C. and Hong, B.R., Group Robot Pursuit-evasion Problem Based on Game Theory, Journal of Harbin Institute of Technology, vol. 35, no. 9, pp. 1056-1059, 2003.
[8] Isler, V., Kannan, S. and Khanna, S., Locating and Capturing an Evader in a Polygonal Environment, Workshop on Algorithmic Foundations of Robotics (WAFR’04), 2004.
[9] Rajko, S. and LaValle, S.M., A Pursuit-Evasion BUG Algorithm, IEEE International Conference on Robotics and Automation, pp. 1954-1960, 2001.
[10] Song, M.P., Gu, G.C. and Zhang, R.B., Distributed Control System for the Cooperative Task of Multi-Mobile Robots, Robot, vol. 25, no. 5, pp. 456-460, 2003.

College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang 150001, CHINA
E-mail: songmeiping@hrbeu.edu.cn