Automatic generation of biped walk behavior using genetic algorithms

Hugo Picado1,2, Marcos Gestal3, Nuno Lau1,2, Luis P. Reis4,5, Ana M. Tomé1,2
hugopicado@ua.pt, mgestal@udc.es, nunolau@ua.pt, lpreis@fe.up.pt, ana@ua.pt

1 Institute of Electronics and Telematics Engineering of Aveiro, Portugal
2 Dep. of Electronics, Telecommunications and Informatics, Univ. Aveiro, Portugal
3 Artificial Neural Network and Adaptive System Lab., Univ. Coruña, Spain
4 Artificial Intelligence and Computer Science Lab., Univ. Porto, Portugal
5 Faculty of Engineering of the University of Porto, Portugal

Abstract. Controlling a biped robot with several degrees of freedom is a challenging task that attracts the attention of researchers in the fields of biology, physics, electronics, computer science and mechanics. For a humanoid robot to perform in complex environments, fast, stable and adaptive behaviors are required. This paper proposes a solution for automatic generation of a walking gait using genetic algorithms (GA). A method based on partial Fourier series was developed for joint trajectory planning. GAs were then used for offline generation of the parameters that define the gait. GAs proved to be a powerful method for automatic generation of humanoid behaviors, resulting in a forward walking velocity of 0.51 m/s, which is a good result considering the results of the three best teams of the RoboCup 3D simulation league for the same movement.

Key words: biped, locomotion, genetic algorithms, humanoid, robotics

1 Introduction

For a long time, wheeled robots were used for research and development in the field of Artificial Intelligence and Robotics [1]. However, wheeled locomotion is not adapted to many human environments [2]. This increased the interest in biped locomotion and especially in humanoid robotics.

Biped locomotion control is a complex problem because stability must be achieved over a small support area while coordinating the several degrees of freedom of a biped. The most common approach consists of finding a set of kinematic trajectories and using stabilization criteria to ensure that the generated gait is stable. In this paper, the Center of Mass (CoM) was used to monitor the quality of the gait. If the ground projection of the CoM remains inside the support polygon (the convex hull of the contact points of the feet with the ground), the walk is considered statically stable. Other methods may be used as stabilization criteria. The most popular is the Zero Moment Point (ZMP) (Vukobratovic, 1972) [3], defined as the point on the ground about which the sum of the moments of all the active forces equals zero. If the ZMP remains inside the support polygon, the gait is considered dynamically stable. The ZMP takes into account both the kinematics and the kinetics of the system, which requires solving dynamic equations that are computationally expensive.

This paper presents a GA-based method for developing efficient and robust humanoid behaviors, tested on a simulated version of the humanoid robot NAO [4] in the Simspark [5] simulation environment. The remainder of this paper is organized in five more sections. Section 2 describes how the gait was defined and the method developed for joint trajectory planning, Section 3 briefly describes GAs, Section 4 defines the configuration of the GA used for the tests, and Section 5 presents the experimental results. Section 6 presents some conclusions and proposals for future work.

2 The walking gait

Some human-like movements are inherently periodic and repeat the same set of steps several times (e.g. walk, turn, etc).
The principle of partial Fourier series (PFS) consists of decomposing a periodic function into a sum of simple oscillators, as represented by the following expression:

$$ f(t) = C + \sum_{n=1}^{N} A_n \sin\left( n \frac{2\pi}{T} t + \phi_n \right), \quad \forall t \in \mathbb{R} \tag{1} $$

where N is the number of frequencies, C is the offset, A_n (n = 1..N) are amplitudes, T is the period and φ_n (n = 1..N) are phases.

[Fig. 1: Developed behaviors: Humanoid structure, showing the joints (Head1, Head2, LShoulder1/RShoulder1, LShoulder2/RShoulder2, LUpperArm/RUpperArm, LElbow/RElbow, LHip/RHip, LThigh1/RThigh1, LThigh2/RThigh2, LKnee/RKnee, LAnkle1/RAnkle1, LAnkle2/RAnkle2) and the reference axes (X, Y, Z; roll, pitch, yaw). Adapted from [4].]

Applying these oscillators to each joint, a walking gait was developed and tested with the simulated humanoid NAO in the scope of the RoboCup 3D soccer simulation league, using the Simspark simulation environment [5]. Figure 1 shows the humanoid structure and the reference axes considered in the experiments.

The main idea behind the definition of this gait is to place an oscillator on each joint we intend to move in order to define its trajectory. The oscillators are placed on the following joints: LShoulder1, RShoulder1, LThigh1, RThigh1, LThigh2, RThigh2, LKnee, RKnee, LAnkle1, RAnkle1, LAnkle2 and RAnkle2. Hence, 12 single-frequency oscillators are used. Since each single-frequency oscillator has 4 parameters to define (C, A, φ and T), 48 parameters are needed to completely define the gait. It is common to assume sagittal symmetry of the walk, which imposes the same movements on corresponding left and right joints with a half-period phase shift. Hence, it is possible to halve the number of parameters, resulting in 24 parameters. Additionally, the period of all oscillators should be the same to keep all the joints synchronized by a single frequency clock. This consideration reduces the number of parameters to 19 (an offset, an amplitude and a phase for each of the 6 left-sided oscillators, plus one shared period).
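As a rough illustration of Eq. (1) and of the parameter count above, the following Python sketch evaluates a partial Fourier series and reproduces the 48, 24 and 19 figures. The function name `pfs` and the numeric values in the usage lines are hypothetical and chosen only for illustration; they are not taken from the experiments.

```python
import math

def pfs(t, offset, amplitudes, phases, period):
    """Evaluate f(t) = C + sum_n A_n * sin(n * 2*pi/T * t + phi_n), as in Eq. (1)."""
    omega = 2.0 * math.pi / period
    return offset + sum(
        a * math.sin(n * omega * t + phi)
        for n, (a, phi) in enumerate(zip(amplitudes, phases), start=1)
    )

# Parameter count, following the reduction described in the text.
N_JOINTS = 12                  # one single-frequency oscillator per joint
PARAMS_PER_OSCILLATOR = 4      # C, A, phi, T
full = N_JOINTS * PARAMS_PER_OSCILLATOR      # every joint independent
symmetric = full // 2                        # left side only (sagittal symmetry)
shared_period = (N_JOINTS // 2) * 3 + 1      # six (C, A, phi) triples + one T

if __name__ == "__main__":
    # Hypothetical single-frequency oscillator, for illustration only.
    print(pfs(0.0, offset=10.0, amplitudes=[5.0], phases=[math.pi / 2], period=0.5))
    print(full, symmetric, shared_period)
```

With a single amplitude and phase (N = 1) the function reduces to the single-frequency oscillator used for each joint.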
A set of equations can be obtained for the left-sided joints:

$$ f_{LShoulder1}(t) = C_1 + A_1 \sin(2\pi t/T + \phi_1) \tag{2} $$
$$ f_{LThigh1}(t) = C_2 + A_2 \sin(2\pi t/T + \phi_2) \tag{3} $$
$$ f_{LThigh2}(t) = C_3 + A_3 \sin(2\pi t/T + \phi_3) \tag{4} $$
$$ f_{LKnee}(t) = C_4 + A_4 \sin(2\pi t/T + \phi_4) \tag{5} $$
$$ f_{LAnkle1}(t) = C_5 + A_5 \sin(2\pi t/T + \phi_5) \tag{6} $$
$$ f_{LAnkle2}(t) = C_6 + A_6 \sin(2\pi t/T + \phi_6) \tag{7} $$

where f_X(t) is the trajectory equation for the joint X, A_i (i = 1..6) are amplitudes, T is the period, φ_i (i = 1..6) are phases and C_i (i = 1..6) are offsets. The right-sided joints can be obtained with no additional parameters: for roll joints, the left and right sides perform the same trajectories over time; for pitch joints, the right side is obtained by adding a phase of π to the corresponding oscillator. The unknown parameters together form the genome that will be used by the genetic algorithm to generate the gait.

3 Genetic algorithms

This paper presents an approach based on Genetic Algorithms (GA) [6] for modeling the previously described behavior. GAs were developed following ideas and techniques from genetics and natural selection theories [7]. After an initial population of individuals is generated, and according to the principle of survival of the fittest, the individuals generate the next population by transmitting their genes through several operations. Each individual in this population represents a possible solution to the problem, and each one is represented by a chromosome, which is a strand of numerical values or genes. There are three major operations in a GA (summarized in Figure 2):

i Selection: chooses some parents for crossover according to predefined rules (cost function or fitness).
ii Crossover: generates offspring from parents by exchanging some genes according to different schemas: one-point, two-point, uniform, etc. The offspring thus inherits some characteristics from each parent.
Although there are several ways to get a new population, the offspring will usually replace an individual of the current population provided it has a similar fitness. Another option consists of using an auxiliary population, which replaces the current one once it is full of individuals.
iii Mutation: generates offspring by randomly changing one or several genes in an individual. It allows searching new regions of the solution space which would otherwise not be explored. Mutation therefore prevents the GA from focusing only on a local search, which in turn increases the probability of finding global optima.

[Fig. 2: General schema for a genetic algorithm: initialize a random population, then apply selection, crossover, mutation and insertion of children, repeating until the stop condition is met.]

A standard GA proceeds as follows: an initial population of individuals is generated at random. The individuals in the current population are evaluated according to some predefined quality criterion (fitness function). To form a new population (the next generation), individuals are selected according to their fitness. Then some or all of the existing members of the current population are replaced with the newly created members. New members are created by different operations (crossover, mutation), which should (hopefully) make the new individuals better than the old ones. If the GA has been well designed, the population will converge towards an optimal solution to the problem. Finally, two pragmatic criteria are generally used to stop the GA: either a given number of generations has elapsed or a given performance error has been reached (both set by the user).

4 GA configuration

The parameters described in Section 2 were defined by a GA using the GADS toolbox for Matlab [8]. The algorithm creates an initial population of 100 chromosomes initialized randomly.
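The loop described in Section 3 (random initial population, evaluation, selection, crossover, mutation, replacement) can be sketched in a few lines of Python. This is a minimal illustration and not the GADS implementation: the names `run_ga` and `sphere`, the binary tournament selection, the one-point crossover, the per-gene mutation rate, the gene bounds and the toy objective are all assumptions chosen for brevity. Only the population size of 100 and the 19 genes come from the text.

```python
import random

def sphere(chromosome):
    """Toy objective (assumption): sum of squared genes, minimised at zero."""
    return sum(g * g for g in chromosome)

def run_ga(fitness, n_genes, pop_size=100, generations=50,
           p_mutation=0.1, bounds=(-1.0, 1.0), seed=0):
    """Minimise `fitness` with a bare-bones generational GA (needs n_genes >= 2)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Initial population: random chromosomes (lists of real-valued genes).
    population = [[rng.uniform(lo, hi) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)

    def select():
        # Assumed operator: binary tournament (the fitter of two random picks).
        a, b = rng.sample(population, 2)
        return a if fitness(a) <= fitness(b) else b

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            cut = rng.randrange(1, n_genes)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: each gene is redrawn uniformly with probability p_mutation.
            child = [rng.uniform(lo, hi) if rng.random() < p_mutation else g
                     for g in child]
            children.append(child)
        population = children                        # replace the old generation
        candidate = min(population, key=fitness)
        if fitness(candidate) < fitness(best):
            best = candidate                         # keep the best seen so far
    return best

if __name__ == "__main__":
    best = run_ga(sphere, n_genes=19)
    print(len(best), sphere(best))
```

In the setting of this paper, the toy objective would be replaced by a call that runs one simulated walking test and returns its fitness, which is far more expensive than the arithmetic used here.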
The roulette method used for selection simulates a roulette wheel in which parents are selected with a probability proportional to their fitness. The mutation follows a uniform distribution with a probability defined by pm = 0.5. Crossover uses the scattered method, which creates a random binary vector, selects the genes where the vector is a 1 from the first parent and the genes where the vector is a 0 from the second parent, and combines the genes to form the child. The fraction of the population that is created by crossover is defined by the parameter pc = 0.8. For elitism, 10 chromosomes are selected to survive into the next generation.

The fitness function has to be chosen carefully in order to achieve good results. For forward walking, a simple but effective fitness function to minimize is the distance to the ball, assuming that the robot is placed in a fixed position away from the ball for each individual test. Additionally, the torso average oscillation is used in order to obtain more stable gaits. The final version of the fitness function is stated as follows:

$$ \mathit{fitness} = d_{Ball} + \theta \tag{8} $$

where dBall is the distance to the ball (in meters) and θ is the average oscillation of the torso (in radians per second). The torso average oscillation θ is calculated from the values received from the gyroscope installed on the torso, using the following equation [9]:

$$ \theta = \frac{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2 + \sum_{i=1}^{N} (y_i - \bar{y})^2 + \sum_{i=1}^{N} (z_i - \bar{z})^2}}{N} \tag{9} $$

where N represents the number of simulation cycles, x_i, y_i and z_i are the values received from the gyroscope in the i-th cycle, and x̄, ȳ and z̄ are the arithmetic means of the gyroscope readings over time. The velocity is implicitly considered in the evaluation, since each test is time-bounded with a fixed duration.

5 Experimental results

For the optimization process, GADS generates a file containing the parameters, and then the simulator tests those parameters.
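Returning briefly to Section 4, the fitness of Eqs. (8) and (9) and the scattered crossover can be sketched as follows. This is an illustrative Python reading of the formulas, not the code used in the experiments: the function names, the representation of gyroscope data as a list of (x, y, z) tuples and the explicit random generator argument are assumptions.

```python
import math
import random

def torso_oscillation(gyro):
    """Eq. (9): gyro is a list of (x, y, z) readings, one per simulation cycle."""
    n = len(gyro)
    means = [sum(axis) / n for axis in zip(*gyro)]
    squared = sum((value - mean) ** 2
                  for reading in gyro
                  for value, mean in zip(reading, means))
    return math.sqrt(squared) / n

def fitness(d_ball, gyro):
    """Eq. (8): distance to the ball plus torso oscillation; lower is better."""
    return d_ball + torso_oscillation(gyro)

def scattered_crossover(parent1, parent2, rng):
    """Random binary mask: gene from parent1 where the mask is 1, else parent2."""
    mask = [rng.randint(0, 1) for _ in parent1]
    return [a if bit == 1 else b for a, b, bit in zip(parent1, parent2, mask)]

if __name__ == "__main__":
    # Hypothetical readings and distance, for illustration only.
    readings = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    print(fitness(0.5, readings))
    print(scattered_crossover([1] * 19, [2] * 19, random.Random(0)))
```

A perfectly steady torso gives an oscillation of zero, so in that case the fitness reduces to the remaining distance to the ball.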
When the test finishes, the simulator generates a file with the resulting fitness, and GADS associates the fitness with the corresponding parameters. The process then repeats until it is stopped manually or a stopping criterion is met. The generation process took 5 entire days to complete using a Core 2 Duo 2.4 GHz CPU with 1 GB of physical memory. Figure 3 shows the evolution of the fitness. The minimum fitness is 0.20311, which is a very good result. Table 1 shows the values of the best individual for A1..6, φ1..6 and C1..6. For the period, T, the optimal generated value was 0.3711.

[Fig. 3: Evolution of the fitness. Top panel: best and mean fitness value per generation (Best: 0.20311, Mean: 2.8653). Bottom panel: current best individual over the 19 variables.]

Table 1: Best generated individual (index i follows Eqs. (2) to (7))

i   Joint        A_i        φ_i        C_i
1   LShoulder1   57.1842    2.9594    -88.4624
2   LThigh1       5.6445   -2.2855      3.6390
3   LThigh2      57.1211    0.0887     35.9536
4   LKnee        39.6205   -1.8292    -39.9481
5   LAnkle1      46.6315    1.7640     28.5095
6   LAnkle2       3.7947   -1.2067     -2.9360

Figure 4 shows the evolution of the CoM and the placement of the feet in the XY plane. The robot tends to shift the CoM towards the support foot while walking. The same plot also shows the large size of the steps.

[Fig. 4: The CoM and the feet in the XY plane]

The average velocity (Figure 5a) exceeded 0.50 m/s. This is a good velocity taking into account the torso average oscillation, which is shown in Figure 5b. These results can be considered good in comparison with the three best teams of the RoboCup 3D simulation league competition of 2008 (Suzhou, China), which reached forward velocities of 1.20 m/s (SEU-RedSun [10]), 0.67 m/s (WrightEagle [11]) and 0.43 m/s (LittleGreenBats [12]) (see footnote 6).
Footnote 6: These results were retrieved from the logfiles of the RoboCup 2008 competition, which may be found at http://www.robocup-cn.org/.

[Fig. 5: (a) Average velocity (m/s) over time (s). (b) Torso average oscillation (deg/s) over time (s).]

Figure 6 shows NAO walking forward using the proposed solution. By t = 1.44 s the biped has already covered a considerable distance. The large size and height of the steps are also visible.

[Fig. 6: Walking gait screenshots.]

6 Conclusion and future work

This paper proposed an efficient method for automatic generation of biped locomotion behaviors, which consists of combining partial Fourier series and genetic algorithms (GA). GAs are based on the biological evolution of species and apply selection, mutation and crossover operators to a set of chromosomes. A forward walking gait was developed using 12 single-frequency oscillators whose parameters were generated offline by a GA. The GA proved to be very good at achieving the expected results. The generated walking gait is fast and stable, and has the advantage of low sensitivity to the disturbances inherent to the simulation environment.

The generated walk relied mainly on the CoM to monitor the quality of the gait. However, as future work, the calculation and monitoring of the ZMP trajectory is essential for achieving dynamic stability. Moreover, motion capture, which consists of monitoring human behaviors, provides a good way to define humanoid behaviors, owing to the anthropomorphic similarities between the two. Future work should also explore other machine learning methods, such as reinforcement learning, in addition to genetic algorithms.
These methods offer great advantages for the automatic generation of behaviors, reducing and possibly eliminating human intervention during the optimization or learning process.

Acknowledgements

This research was partially supported by FCT-PTDC/EIA/70695/2006 Project – "ACORD - Adaptive Coordination of Robotic Teams".

References

1. Robin Murphy. Introduction to AI Robotics. MIT Press (2000).
2. Maria Prado, Antonio Simón, Ana Pérez, and Francisco Ezquerro. Effects of terrain irregularities on wheeled mobile robot. Robotica, 21:143–152 (2003).
3. Miomir Vukobratovic and Yury Stepanenko. On the stability of anthropomorphic systems. Mathematical Biosciences, 15 i1:1–37 (1972).
4. David Gouaillier, Vincent Hugel, Pierre Blazevic, Chris Kilner, Jerome Monceaux, Pascal Lafourcade, Brice Marnier, Julien Serre, and Bruno Maisonnier. The NAO humanoid: a combination of performance and affordability. CoRR (2008).
5. Oliver Obst and Markus Rollmann. Spark - a generic simulator for physical multiagent simulations. In Gabriela Lindemann, Jörg Denzinger, Ingo J. Timm, and Rainer Unland, editors, MATES, volume 3187 of Lecture Notes in Computer Science, pages 243–257. Springer (2004).
6. John Holland. Adaptation in Natural and Artificial Systems: an introductory analysis with applications to biology, control, and artificial intelligence. The University of Michigan Press (1975).
7. Charles Darwin. On the Origin of Species by Means of Natural Selection. Murray (1859).
8. The MathWorks. Genetic Algorithm and Direct Search Toolbox, User's Guide, Version 2. The MathWorks (2005).
9. Milton Heinen and Fernando Osório. Applying genetic algorithms to control gait of physically based simulated robots. In Proceedings of 2006 IEEE Congress on Evolutionary Computation, pages 500–505 (2006).
10. Xu Yuan, Shen Hui, Qian Cheng, Chen Si, and Tan Yingzi. SEU-RedSun 2008 soccer simulation team description. In Proceedings CD of RoboCup 2008, China (2008).
11. Xue Feng, Tai Yunfang, Xie Jiongkun, Zhou Weimin, Ji Dinghuang, and Xiaoping Chen Zhang Zhiqiang. Wright Eagle 2008 3D team description paper. In Proceedings CD of RoboCup 2008, China (2008).
12. Sander van Dijk, Martin Klomp, Herman Kloosterman, Bram Neijt, Matthijs Platje, Mart van de Sanden, and Erwin Scholtens. Little Green Bats humanoid 3D simulation team description paper. In Proceedings CD of RoboCup 2008, China (2008).