Farhan Imtiaz

Emergence of Reflexive Behavior from Single Muscle Twitches

Master Thesis

Bio-Inspired Robotics Lab
Swiss Federal Institute of Technology (ETH) Zurich

Supervision
Prof. Dr. F. Iida
Dr. H. Gravato Marques

August 2011

Preface

I would like to thank all members of the Bio-inspired Robotics Laboratory for the collaboration as well as the great time we have spent together. I would especially like to thank my supervisors, Prof. Fumiya Iida and Dr. Hugo Gravato Marques, for their continuous guidance, encouragement and interesting discussions during my master thesis, and for providing me with a stimulating environment that laid the foundation for this work.

Abstract

As the needs of human society grow, robots are expected to interact with humans in an increasingly collaborative way. To maximize the potential for such interactions, robots are being developed that imitate characteristics of the human body. However, there is still no standard technique for controlling such robots. Developmental robotics addresses this problem by exploring the mechanisms of human development. We are interested in how motor control develops through autonomous physical interaction with the environment. In this paper, we explore how basic reflexes (the myotatic reflex, the reverse myotatic reflex and the reciprocal inhibition reflex) can be learned through such interaction, using single muscle twitches to interact with the environment. In our approach we look at biology not only for inspiration but also for validation. The reflexes studied here are relatively well understood in biology, and there are models available which can be used to validate our results. We believe that, if we find a principle according to which human reflex connectivity can be autonomously developed, we can then potentially apply the same principle to generate appropriate reflexive behaviour in robotic platforms.
Contents

Abstract
1 Introduction
2 Motivation
3 Necessary conditions for the self-organization of human-reflexes
4 Methods
4.1 Simulation
4.2 Self-organization process
4.3 Reflex Activity
4.4 Experiment 1
4.5 Experiment 2
4.6 Experiment 3
5 Results
5.1 Experiment 1
5.2 Experiment 2
5.3 Experiment 3
6 Discussion
6.1 Timing, thresholding, linearity and modulation of reflex activity
6.2 Emergence of reflex activity using the ECCEROBOT muscle model
7 Conclusions

1 Introduction

In humans the process of development starts in the uterus well before birth. Here, non-goal directed spontaneous motor activity (SMA) of motor neurons seems to be a major driving force in the development and maturation of the young nervous system [4], [5], [7], [17], [18]. Development at this stage is intrinsically driven by self-organization processes which structure the sensorimotor information flow in the creature according to the systematic exploration of timed sensor and motor activity [12], [21], [22]. During this process, information about the three-dimensional shape and mechanical properties of the body is laid down in the synaptic connectivity of sensorimotor systems [2], [25], [26], [23]. The primary way in which SMA might influence motor learning is to serve as an exploratory mechanism which guides the developmental process [24], [14].
In utero, the most frequently observed movement is the General Movement (GM), a non-goal-oriented type of behavior in which all parts of the body participate [4], [5], [6]. GMs play an important role in the development of the nervous system and can be used to predict its integrity [9], [10], [11]. Another observed movement is the Single Muscle Twitch (SMT), which has been observed during sleep in early development and which consists of small contractions of individual muscles [19]. SMTs have been used to explain the development of the withdrawal reflex (in simulation) at an early stage of development [19], [20].

In this paper we follow a similar strategy to self-organize reflexive behaviors involved in the interaction between agonist and antagonist muscles, namely the Myotatic, the Reverse Myotatic, and the Reciprocal Inhibition reflexes (see Fig. 2). The Myotatic reflex responds to an undesired stretch in the muscle imposed by external loads [30]. When an external load causes a muscle to stretch, it activates the muscle's Ia fibers, which in turn excite the α-motoneurons of the homonymous muscle (i.e. the same muscle). This results in a muscle contraction which opposes the movement generated by the external load. The Reciprocal Inhibition reflex is involved in agonist-antagonist interactions; the stretch in one muscle inhibits the antagonist muscle from contracting and prevents it from counteracting the movement initiated by the stretched muscle [30]. The Reverse Myotatic reflex prevents muscles from producing excessive forces [30]. Strong contractions cause large force increases in the tendons, which activate the Ib fibers. This activity inhibits the homonymous motoneurons and decreases the activity of the muscle. These reflex behaviors may in future provide an understanding of the development of locomotion [33]. The methodology is shown in Fig. 1.
First, an SMT activates a given motoneuron, which, through the mechanical apparatus of the body, results in systematic changes in the sensor values. Second, the temporally correlated sensorimotor activity is captured and local body maps are formed accordingly. Third, when an external stimulus is applied, the relevant muscles (those identified as connected to the relevant sensor in the second stage) produce involuntary and stereotyped behaviors [17], [18]. Our primary hypothesis in this paper is that reflexes (whether innate or ontogenetically acquired) are not mere arbitrary behaviors exploited by evolution; they reflect the geometry and mechanics of the human body.

The remainder of this paper is organized as follows. The second section introduces the ECCEROBOT platform as the main motivation for this paper. The third section describes the necessary conditions for the emergence of the reflexes mentioned above. The fourth and fifth sections describe the methods and the results obtained, respectively. The sixth section discusses the results obtained. The last section provides the main conclusions of our work.

2 Motivation

The current work is done in the context of the ECCEROBOT project, which attempts to identify suitable approaches to control compliant tendon-driven robots (see Fig. 3). The ECCEROBOT is a hand-molded anthropomimetic robot, which aims to capture as closely as possible the compliance and complexity of the human musculo-skeletal system [35], [41]. The actuation principle of the robot consists of a DC motor and a gearbox in series with a piece of kite line and an elastic shock cord. The motor and the shock cord are attached to different limb parts. When actuated, the motor winds up the kite line and stretches the shock cord, which produces a force on the attachment point, analogous to a muscle contraction.

Figure 1: Methodology. Panel I, Single Muscle Twitches: SMTs drive the actuators and, through the body mechanical structure, induce sensor responses. Panel II, Information Structure (body mappings): mappings between actuators and sensors. Panel III, Reflexive Behaviour: externally induced sensor information is mapped from sensors to actuators and produces reflexive behaviour.

Figure 2: The reflexes investigated in this paper. The stars represent α-motoneurons, the large solid circles represent inhibitory interneurons, semicircular arcs represent excitatory connections, and small solid circles represent inhibitory connections. Type Ia sensors provide information about changes in muscle length and type Ib sensors mainly provide information about changes in muscle tension. a) The Myotatic reflex is carried out through an excitatory connection between the Ia fibers and the α-motoneurons of the homonymous muscle. The Reciprocal Inhibition reflex is carried out through an inhibitory connection between the Ia fibers and the α-motoneurons of the antagonist muscle; this connection is mediated by an inhibitory interneuron. b) The Reverse Myotatic reflex is carried out through an inhibitory connection between the Ib fibers and the α-motoneurons of the homonymous muscle; this connection is also mediated by an inhibitory interneuron.

Figure 3: The anthropomimetic ECCEROBOT [35], [41].

The lack of an appropriate model of the robot, and the complexity of the interactions between the different body parts, led us to adopt a developmental approach in which the sensorimotor interactions observed are used to autonomously develop increasingly complex behaviors [1], [2], [3]. This paper describes the first stage of such an investigation. In our approach we look at biology not only for inspiration but also for validation. The reflexes sought here are relatively well understood in biology, and there are models available which can be used to validate our results [27], [28], [29], [30], [31].
If we find a principle according to which human reflex connectivity can be autonomously developed, we can then potentially apply the same principle to generate appropriate reflexive behaviour in our robotic platform.

3 Necessary conditions for the self-organization of human-reflexes

In this paper we hypothesize that at least some human reflexes (those investigated here) reflect the nature of the human body, so that their emergence requires a system with properties similar to those of the human musculo-skeletal system. We propose four necessary conditions for the emergence of the three reflexes investigated in this paper:

1. Availability of Length and Force sensors. The reflexes investigated involve sensors which can measure changes in muscle length (Ia fibers) as well as changes in muscle force (Ib fibers). The first condition requires each muscle in the system to be equipped with both types of sensors.

2. Agonist-Antagonist muscle pair. The reflexes investigated require the interaction between agonist and antagonist muscles. Therefore the testing system must consist of agonist and antagonist muscle pairs.

3. Perturbation-free Environment. We hypothesize that our reflexes will only emerge if the probability of external perturbations is small. In humans, during the first stages of development the environment is rather static and predictable, which favors the learning of basic sensorimotor interactions [8], [13], [15], [16]. This is certainly valid for the foetus in the womb as well as for newborn babies while sleeping. One of our hypotheses is that a static environment is fundamental for the development of basic reflexes.

4. Biological muscle model. The muscle model used must have properties similar to those of human muscle. First, the muscles must have asymmetrical conditioning, i.e. they can only produce active force when contracting but not when extending. Second, the muscle should impose a very small resistance to movement when relaxed.
And third, the muscle should not go slack.

While the first two conditions are self-explanatory, the others require empirical data to be verified. We will investigate the impact of external perturbations on the system (see Sections 4.4 and 4.5) and compare the reflexes acquired using a human muscle model with those acquired using the muscle model of the ECCEROBOT (see Sections 4.4 and 4.6).

4 Methods

4.1 Simulation

For this investigation we used a virtual 3D model of the human legs. This model includes one hip joint, one knee joint and one ankle joint in each leg. In our experiments only the hip joints were used; they are modeled as ball and socket joints (see Fig. 4). For the actuation we use one agonist-antagonist pair of muscles in each leg: the Biceps femoris and the Quadriceps femoris, which (in our case) are mainly responsible for hip extension and flexion, respectively. Each muscle is simulated as a straight line between two rigid bodies (see Fig. 4) and includes two types of sensors: one that measures the length of the muscle, i.e. the distance between the two attachment points, and one that measures the force at the attachment points. The derivatives of these signals provide information analogous to that of the Ia fibers and the Ib fibers. Two additional muscles have been implemented in each leg to test the reflexes; these muscles are external to the legs and are used only to produce external sensory stimulation.

The simulation of the leg dynamics has been carried out using Caliper, the ECCEROBOT physics simulator [32]. The framework is based on Bullet Physics [34] and OpenGL graphics [40], and it allows the real-time simulation of the interactions between a large number of rigid bodies, different types of joints and different muscle models.

The main muscle model used in our investigation is based on a 2-element nonlinear Hill model [36], [39], [37]. This model captures, in a simple way, the contraction of muscle fibers as well as basic muscle dynamics.
The two elements are an active contractile element in parallel with a passive elastic element. The contractile element models the active force generated by the muscle fibers. This element includes a damping mechanism that simulates the force-velocity relation of biological muscles [39]. The passive elastic element models the muscle fiber's resistance to deflection and prevents the muscle from going slack.

Figure 4: Diagram of the hip model implemented. Each muscle consists of a straight line (represented with dashes) connecting two attachment points (filled circles). There are four muscles in the system: the Quadriceps (in our system responsible for hip flexion), the Biceps (in our system responsible for hip extension), and two external muscles (Ext1 and Ext2) which are used only to stimulate the reflex activity. Each muscle has two sensors: one that estimates the force, F, produced at the attachment points, and one that estimates the length, L, of the muscle.

The force produced at the attachment points is given by:

\[ F = F_{CE} + F_{PE} \tag{1} \]

where $F_{CE}$ is the force produced by the contractile element, and $F_{PE}$ is the force produced by the passive spring element. These forces are given by:

\[ F_{CE} = \frac{\alpha}{1 + C\,v_m^{2}}, \qquad F_{PE} = K_{PE}\,(l_t - l_0) \]

where $C$ and $K_{PE}$ are constant factors, $\alpha$ is the motor activation, $v_m$ is the rate of change of the muscle length, $l_t$ is the current length of the muscle and $l_0$ is the resting length of the muscle. In our system, as in biology, the passive force $F_{PE}$ of the muscle is significantly smaller than the force $F_{CE}$ generated by the contractile element.

For comparison, we use a second muscle model which simulates the dynamics of the ECCEROBOT artificial muscle.
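Before turning to that second model, the force computation of the Hill-type model can be illustrated with a minimal Python sketch. It is not the Caliper implementation: it assumes that the contractile force has the form α / (1 + C·v_m²) and that the passive force is the linear spring K_PE·(l_t − l_0), and the function name `hill_force` as well as the default values of `C` and `K_PE` are hypothetical placeholders.

```python
def hill_force(alpha, v_m, l_t, l_0, C=1.0, K_PE=1.0):
    """Total force of a two-element Hill-type muscle (illustrative sketch).

    alpha   : motor activation (>= 0)
    v_m     : rate of change of the muscle length
    l_t     : current muscle length
    l_0     : resting muscle length
    C, K_PE : constant factors (placeholder defaults, not the thesis values)
    """
    # Contractile element: the activation is damped by the squared
    # length-change rate (assumed form alpha / (1 + C * v_m**2)).
    f_ce = alpha / (1.0 + C * v_m ** 2)
    # Passive elastic element: linear spring around the resting length.
    f_pe = K_PE * (l_t - l_0)
    return f_ce + f_pe


# A relaxed muscle at its resting length produces no force.
relaxed = hill_force(alpha=0.0, v_m=0.0, l_t=0.30, l_0=0.30)
# An activated muscle that is not changing length produces the full active force.
isometric = hill_force(alpha=2.0, v_m=0.0, l_t=0.30, l_0=0.30)
# The same activation yields less force while the muscle length is changing.
moving = hill_force(alpha=2.0, v_m=1.0, l_t=0.30, l_0=0.30)
print(relaxed, isometric, moving)
```

Under these assumptions the damping term only ever reduces the active force, which is one simple way of obtaining a force-velocity dependence of the kind the text describes.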
The ECCEROBOT muscle model simulates the dynamics of a DC motor with a gearbox in series with a kite line and an elastic shock cord. When the motor is actuated it winds up the kite line and stretches the shock cord; this produces a force that brings the attachment points closer together, analogous to a muscle contraction. The force, $F$, produced by the muscle is given by:

\[ F = F_S + F_D = K\,\Delta L_S + D\,\frac{d}{dt}\Delta L_S \tag{2} \]

where $F_S$ is the spring force in the kite line, $F_D$ is a spring damping force, $K$ is the spring constant, $D$ is the damping constant, and $\Delta L_S$ is the spring elongation. As muscles can only pull and not push, the additional condition $F \ge 0$ is added. The force, $F_M$, generated by the motor on the kite line is calculated as:

\[ F_M = \frac{\tau_{LG}}{r} \tag{3} \]

where $r$ is the radius of the motor shaft, and $\tau_{LG}$ is the torque generated by the gearbox. This torque is calculated as:

\[ \tau_{LG} = N \eta\, \tau_{LM} - \tau_{CG} - \omega_G\, \mu_G - J_G \frac{d\omega_G}{dt} \tag{4} \]

where $N$ is the gearbox ratio, $\eta$ is the gearbox efficiency, $\tau_{CG}$ is the Coulomb friction torque of the gearbox, $\mu_G$ is the gearbox viscous friction constant, $J_G$ is the gearbox inertia, and $\tau_{LM}$ is the motor load torque. The motor load torque is given by:

\[ \tau_{LM} = K_T\, i - J_M \frac{d\omega_M}{dt} - \mu_M\, \omega_M - \tau_{CM} \tag{5} \]

where $K_T$ is the torque constant, $i$ is the motor current, $J_M$ is the motor inertia, $\mu_M$ is the motor viscous friction constant, $\omega_M$ is the motor angular velocity, and $\tau_{CM}$ is the Coulomb friction torque of the motor. We used the same parameters as those identified in [32] from observed data using Multiple Regression Analysis. The parameters used in our simulation are shown in Table 4.1.

Table 4.1
Parameter: K_PE  C  K  D  r  N  η  τ_CG  µ_G  J_G  τ_LM  K_T  J_M  µ_M  τ_CM
Value: 1  106  10000  500  0.005 m  100  0.09618  0.001 x 0.001  2e-8  0.001  7.99e-3  1e-6  0.001  0.001

4.2 Self-organization process

The main purpose of the self-organization process is to identify the relationship between sensor and motor activity (step II in Fig. 1). This is done in two stages.
In the first stage we identify which motor is connected to each sensor; in the second stage, we quantify the strength of these connections and characterize their nature (excitatory or inhibitory). In the first stage we normalize the sensor and motor activity as follows:

\[ S'_j = \begin{cases} 0, & \text{if } N_{min} \le \dot{S}_j \le N_{max} \\ 1, & \text{otherwise} \end{cases} \qquad M'_i = \begin{cases} 0, & \text{if } M_i = 0 \\ 1, & \text{otherwise} \end{cases} \tag{6} \]

where $S'_j$ is the normalized activity of sensor $j$, $\dot{S}_j$ is the derivative of sensor $j$, $M'_i$ is the normalized activity in motor $i$, and $M_i$ is the signal sent to motor $i$. $N_{min}$ and $N_{max}$ are the noise boundaries extracted from the sensor signals collected when the system is at rest (i.e. with no motor activation). The connectivity is calculated as:

\[ C_{i,j} = \max_{l} \frac{\mathrm{xCorr}(S'^{\,l}_j, M'_i)}{\mathrm{xCorr}(M'_i, M'_i)}, \qquad l = 0, \ldots, maxLag \tag{7} \]

where $\mathrm{xCorr}(X, Y)$ is the cross-correlation between signals $X$ and $Y$, $S'^{\,l}_j$ is the normalized sensor signal $j$ lagged by $l$ samples, and $maxLag$ is the maximum lag allowed for the sensor data. This connectivity is then thresholded. Values above a certain threshold, $T$, establish a connection between a given sensor and a given motor, and values below this threshold establish no such connection:

\[ A_{i,j} = \begin{cases} 1, & \text{if } C_{i,j} > T \\ 0, & \text{otherwise} \end{cases} \tag{8} \]

where $A$ is the adjacency matrix holding the connectivity between sensors and actuators. The connectivity is characterized and quantified as:

\[ Q_{i,j} = -\frac{\int_0^t \dot{S}_j \, dt}{\int_0^t M_i \, dt \cdot \max(\dot{S}_j)} \tag{9} \]

where $Q$ is the final connectivity matrix. Excitatory connections are characterized by a positive value and inhibitory connections by a negative value. The strength of the connection is given by its magnitude. This process is very similar to differential Hebbian learning [38] using standard cross-correlation, but our method allows for a more intuitive way of setting the connectivity threshold.

4.3 Reflex Activity

Intuitively, the connectivity in Q describes motor-to-sensor connections, as the directed flow of information is from motors to sensors.
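To illustrate how such a motor-to-sensor map could be obtained from twitch data, the following minimal Python sketch walks through Eqs. (6) to (9) of Section 4.2 for a single motor and a single sensor. It is a sketch under stated assumptions rather than the code used in the experiments: the cross-correlation is taken as a plain sum of products with the sensor signal delayed relative to the motor, the denominator of Eq. (7) is the zero-lag autocorrelation, max(Ṡ_j) in Eq. (9) is read as the largest absolute derivative so that the normalisation cannot flip the sign, and Q is only evaluated where A marks a connection. All function names, the toy signals, and the values of `T`, `N_MIN` and `N_MAX` are hypothetical; `DT = 0.05` corresponds to the 20 Hz sampling rate.

```python
def normalise_sensor(s_dot, n_min, n_max):
    """Eq. (6), sensor part: 0 inside the noise band, 1 outside."""
    return [0 if n_min <= v <= n_max else 1 for v in s_dot]


def normalise_motor(m):
    """Eq. (6), motor part: 0 when the motor is silent, 1 otherwise."""
    return [0 if v == 0 else 1 for v in m]


def xcorr(x, y, lag=0):
    """Sum of products of x delayed by `lag` samples with y (assumed form)."""
    return sum(x[t + lag] * y[t] for t in range(len(y) - lag))


def connectivity(s_norm, m_norm, max_lag):
    """Eq. (7): best lagged correlation over the motor autocorrelation."""
    auto = xcorr(m_norm, m_norm)
    if auto == 0:
        return 0.0  # motor never active: no evidence for a connection
    return max(xcorr(s_norm, m_norm, l) for l in range(max_lag + 1)) / auto


def quantify(s_dot, m, dt):
    """Eq. (9): signed strength (positive excitatory, negative inhibitory)."""
    peak = max(abs(v) for v in s_dot)  # assumption: absolute peak derivative
    drive = sum(m) * dt                # discrete integral of the motor signal
    if peak == 0 or drive == 0:
        return 0.0
    return -(sum(s_dot) * dt) / (drive * peak)


# Toy data: one twitch, and a length sensor that shortens one sample later.
m = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
s_dot = [0.0, 0.0, 0.0, -0.5, -0.5, -0.5, 0.0, 0.0, 0.0, 0.0]
N_MIN, N_MAX = -0.05, 0.05   # hypothetical noise boundaries
T = 0.8                      # hypothetical connectivity threshold
DT = 0.05                    # 20 Hz sampling

c = connectivity(normalise_sensor(s_dot, N_MIN, N_MAX),
                 normalise_motor(m), max_lag=3)
a = 1 if c > T else 0                          # Eq. (8)
q = quantify(s_dot, m, DT) if a else 0.0       # Eq. (9), gated by A
print(c, a, q)
```

In this toy case the shortening of the muscle during its own twitch yields a positive entry in Q, i.e. an excitatory connection of the kind associated with the Myotatic reflex.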
Q can, however, also describe directed sensor-to-motor connectivity. This is based on the idea of motor-directed somatosensory imprinting (MDSI) proposed in [19]. This formulation can be seen as a reverse type of Hebbian learning where “the post-synaptic activity in reflex interneurons [and motoneurons] precedes the afferent input” [19]. In this way the reflex activity is given by the external sensor stimulation measured in each sensor, weighted by the respective sensor-to-motor connection:

\[ M_i = \sum_{j=1}^{m} Q_{i,j}\, \frac{\dot{S}_j}{\max(\dot{S}_j)} \tag{10} \]

where $m$ is the number of sensors. To allow for a stable simulation we compute the motor activity based on an average of five samples of the sensor derivative signals. The collection of the five samples starts as soon as one derivative leaves the noise band delimited by $N_{min}$ and $N_{max}$, the thresholds identified during the learning stage. The motor signal is then kept constant for the next five samples, after which the sensor readings are averaged again and a new motor signal is computed.

To produce the external stimulus we contract the external muscles, Ext1 and Ext2, in each leg. Each of these muscles produces a given displacement in one of the limbs. This displacement is then picked up by the different muscle sensors and converted into reflex motor activity using the matrix Q.

4.4 Experiment 1

The main goal of Experiment 1 is to verify that the reflex connectivity can be learned using SMTs and the self-organization process described above. This experiment is carried out using the Hill muscle model (see Section 4.1). The simulation starts with all the muscles relaxed. In this condition the legs hang straight down due to gravity. We start the learning process by performing extremely small contractions in all the leg muscles to produce some sensor noise. We have tried different perturbations, ranging from one order of magnitude lower than the magnitude of the SMTs to half the magnitude of the SMTs. No significant effects were observed on the results obtained.
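For reference, the reflex computation that is later driven by the learned matrix (Eq. 10 of Section 4.3, including the five-sample averaging) can be sketched in Python as follows. This is only an illustration: the function name `reflex_motor_commands`, the example matrix and the sensor values are hypothetical, and the normalisation constant of each sensor is assumed to be the peak absolute derivative seen during learning.

```python
def reflex_motor_commands(Q, s_dot_window, s_dot_peak):
    """Eq. (10) evaluated on a window of sensor-derivative samples.

    Q            : Q[i][j], connection from sensor j to motor i
    s_dot_window : list of samples, each holding one derivative per sensor
    s_dot_peak   : per-sensor normalisation constant (assumed to be the
                   peak absolute derivative observed during learning)
    """
    n_sensors = len(s_dot_peak)
    # Average every sensor derivative over the window (five samples in the text).
    mean = [sum(sample[j] for sample in s_dot_window) / len(s_dot_window)
            for j in range(n_sensors)]
    # Weighted sum of the normalised sensor activity for every motor.
    return [sum(row[j] * mean[j] / s_dot_peak[j] for j in range(n_sensors))
            for row in Q]


# Hypothetical two-muscle example using only the two length sensors:
# rows are motors (Quadriceps, Biceps), columns are sensors (L_Q, L_B).
Q = [[1.0, -1.0],
     [-1.0, 1.0]]
window = [[0.2, 0.0]] * 5   # the Quadriceps is being stretched
peak = [0.4, 0.4]
commands = reflex_motor_commands(Q, window, peak)
print(commands)
```

With these illustrative numbers the stretched Quadriceps receives a positive command and the Biceps a negative one, which is the qualitative pattern of the Myotatic and Reciprocal Inhibition reflexes.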
While these small contractions are applied, sensor data is collected and the noise boundaries ($N_{min}$ and $N_{max}$) are identified. The noise collection lasts for 20 seconds. After the noise collection interval we perform a number of SMTs in each muscle (30 in total). This is done by repeatedly selecting a muscle at random (from a uniform distribution) and contracting the selected muscle with a constant activation for a short period of time. After an SMT the system waits a fixed time before selecting the next muscle to twitch. The waiting time is fixed to a value large enough to allow the system to stabilise, i.e. to stop oscillating. During the SMTs the sensor and motor signals are collected at a rate of 20 Hz. The testing of the reflexes is then carried out by contracting each one of the external muscles and allowing the reflexive behavior to be expressed.

4.5 Experiment 2

The goal of Experiment 2 is to investigate the role of external perturbations during the learning stage (i.e. during the SMTs) and to validate the hypothesis that perturbations significantly affect the reflex connectivity obtained (see the third necessary condition in Section 3). For this purpose we repeated Experiment 1, but now introducing small perturbations in the system with a given probability. We varied the perturbation probability from 0 to 1 in steps of 0.05. The perturbations consisted of short contractions with constant intensity (half the magnitude of the SMT) carried out by the external muscles.

Figure 5: Motor activity produced by SMTs. Panel title: Motor activity during SMT; one trace per motor (M_RQ, M_RB, M_LQ, M_LB); horizontal axis: Time (s).

4.6 Experiment 3

The goal of Experiment 3 is to investigate the impact of a non-biologically inspired muscle model, the ECCEROBOT muscle model, on the reflex connectivity acquired (see the fourth necessary condition in Section 3). For this purpose we repeated Experiment 1 using the ECCEROBOT muscle model described in Section 4.1. There are two main differences between the two muscle models.
First, contrary to biological muscles, the ECCEROBOT muscles do not passively extend when relaxing; extension is limited to the extension of the shock cord in series with the motor. Second, the ECCEROBOT muscles do go slack when the motor unwinds beyond the muscle resting length.

5 Results

5.1 Experiment 1

Figure 5 shows the raw motor activity produced by the SMTs. As can be seen, the twitches occur one at a time. Figure 6 shows the raw force data collected for the two muscles of the right leg during an SMT carried out by the Quadriceps (left) and by the Biceps (right). As can be seen, activity in the force sensor is only observed during contractions of the homonymous muscle but not during contractions of the antagonist muscle. This is because, when a muscle is relaxed, the only force in the muscle is due to the passive element, which has a negligible magnitude compared with the active force that can be produced by the muscle. Similar results have been observed for the left leg.

Figure 6: Raw responses of the force sensors of the Quadriceps (blue) and Biceps (green) of the right leg to an SMT (red) carried out by the Quadriceps (left) and by the Biceps (right). Panel titles: Quadriceps Femoris, Biceps Femoris; axes: Time (s), Force (N); traces: M_RQ or M_RB, F_RQ, F_RB.

Figure 7 shows the raw length data collected for the two muscles of the right leg during an SMT carried out by the Quadriceps (left) and by the Biceps (right). In contrast to the force sensors, the length sensors change their values in response to contractions of the homonymous as well as the antagonist muscles. Similar behaviour has been observed for the left leg. The data collected suggests a connectivity between the force sensors and their homonymous motor, and a connectivity between the length sensors and both their homonymous and antagonist muscles.
Figure 8a shows that this is in fact the connectivity obtained. As expected, no connectivity exists between muscles in one leg and sensors from the other. The quantified matrix is shown in Fig. 8b. As can be seen, the connectivity obtained is, in qualitative terms, similar to that observed in relation to human reflexes. First, we obtain excitatory connections between the length sensors and their homonymous muscles (as in the Myotatic reflex). Second, we obtain inhibitory connections between the length sensors and their antagonist muscles. And third, we obtain inhibitory connections between the force sensors and their homonymous muscles.

Figure 7: The activity of the length sensors of the Quadriceps (blue) and Biceps (green) in the right leg in response to an SMT (red) carried out by the Quadriceps (left) and by the Biceps (right). For clarity, we have subtracted the resting length from the total length of each muscle. Panel titles: Quadriceps Femoris, Biceps Femoris; axes: Time (s), Length (cm); traces: M_RQ or M_RB, L_RQ, L_RB.

We tested the muscle activity in response to a muscle stretch caused by the external muscles. Figure 9 shows the muscle activity when an external load increases the length of the Quadriceps muscle of the right leg. As can be seen, the first reaction to the muscle stretch causes the Quadriceps to contract, and at the same time inhibits the Biceps from contracting. The first activation of the Quadriceps slows down the length increase and causes a drop in the following activation. This behavior is consistent with that observed in relation to the Myotatic and Reciprocal Inhibition reflexes. To investigate the effect of the Reverse Myotatic reflex on the decrease of muscle activity, we repeated the experiment with and without the force sensor, and compared the muscle activations in both conditions.
We observed that the decrease in activity is mainly due to the slower increase in muscle length caused by the first activation, rather than to the Reverse Myotatic reflex. The drop due to the force component is in fact very small: 0.25 motor units. Although small, this behavior is also consistent with the Reverse Myotatic reflex (see Section 6.1 for discussion).

Figure 8: The adjacency matrices A (a) and Q (b). M stands for motor, F stands for force sensor and L for length sensor; in the subscripts, RQ stands for Quadriceps of the right leg, RB for Biceps of the right leg, LQ for Quadriceps of the left leg, and LB for Biceps of the left leg. Black blocks represent excitatory connections and white blocks represent inhibitory connections. Panel titles: Adjacency Matrix (a), Quantified Adjacency Matrix (b); axes: motors M_RQ, M_RB, M_LQ, M_LB against sensors F_RQ, F_RB, F_LQ, F_LB, L_RQ, L_RB, L_LQ, L_LB.

Figure 9: Muscle activity generated in reaction to a stretch in the Quadriceps caused by an external load. Panel title: Quadriceps Femoris Reaction; traces: Load, L_RQ, F_RQ, M_RQ, M_RB; horizontal axis: Time (s).

Figure 10: Quantified Adjacency Matrix with Perturbations. Axes: motors M_RQ, M_RB, M_LQ, M_LB against sensors F_RQ, F_RB, F_LQ, F_LB, L_RQ, L_RB, L_LQ, L_LB.

5.2 Experiment 2

In Experiment 2 we investigate the effect of perturbations during the self-organization process. Our results indicate that, for a perturbation probability higher than 0.1, we cannot obtain the right reflex matrix using our framework. This is because external perturbations induce information in the sensor signals which interferes with that produced by the SMTs. Figure 10 shows the connectivity obtained for a perturbation probability of 0.2. As can be seen, connections are mistakenly identified between length sensors in the right leg and muscles in the left leg. In addition, the system presents several missing connections between length sensors and homonymous and antagonist muscles in both legs.
The force connectivity is correct because the force values are only significant when the homonymous muscles contract (as in biology).

5.3 Experiment 3

Experiment 3 investigates the effect of using a different muscle model on the self-organization process. For this experiment we used the ECCEROBOT muscle model, in which a non-back-drivable motor produces a strong resistance to movement when no voltage is applied. Figure 11 shows the connectivity obtained. As can be seen, the connectivity of the length sensors is the same as in the human muscle model. This connectivity is valid since, when one muscle contracts, it extends the shock cord of the antagonist and decreases the length of the homonymous muscle. In the force sensor connectivity we observe an extra inhibitory connection between the force sensor and the antagonist muscle. This occurs because, when one muscle contracts, the tension in the antagonist muscle also increases due to the lack of back-drivability in the motors.

Figure 11: Quantified Adjacency Matrix with ECCEROBOT Muscle Model. Axes: motors M_RQ, M_RB, M_LQ, M_LB against sensors F_RQ, F_RB, F_LQ, F_LB, L_RQ, L_RB, L_LQ, L_LB.

6 Discussion

6.1 Timing, thresholding, linearity and modulation of reflex activity

The main goal of this paper is not so much to demonstrate appropriate human-like reflex behaviour but rather to show that the correct reflex connectivity can emerge from SMTs. When it comes to behavioral expression, even the simplest reflexes (such as those investigated here) are far from trivial. First, different reflexes have different activity thresholds; for example, the Reverse Myotatic reflex seems to require a significantly higher threshold to be activated than the Myotatic reflex. At the moment it is not clear how this can be developed in an automated way. Second, different reflexes appear at different times depending on the number of interneurons they entail.
For example, the Reverse Myotatic reflex takes longer to become active than the Myotatic reflex because it entails one interneuron in its path while the Myotatic reflex entails none. In our platform the Reverse Myotatic reflex also appears later, but this is due to the fact that force only appears after the muscle starts contracting; this is also valid in humans. Third, it is quite unlikely that the motor signal is a linear combination of the sensor values. Typically a non-linear function such as the sigmoid is used. Although that can easily be incorporated into our system, we see no direct benefit in doing so at such a preliminary stage of our work. Fourth, and most importantly, all the reflex connectivity can be modulated by the supra-spinal and the central nervous systems (REF). This means that the gains of all the connection weights can be manipulated from hierarchically superior systems, allowing for different reflex activity in different behaviors. This is important because it reduces the relevance of the exact strength of the connectivity, and places the highest emphasis on the nature of the connectivity identified (inhibitory or excitatory), which cannot be modified. Nonetheless, we believe that in artificial systems it is relevant to identify at least an appropriate order of magnitude for the weights, in accordance with the range of the raw sensor data.

6.2 Emergence of reflex activity using the ECCEROBOT muscle model

In the experiment carried out using the ECCEROBOT muscle model we can see that our approach scales rather well to produce relevant reflex activity. In the ECCEROBOT model, negative motor signals (or the inhibition of motor activity) will cause the motors to drive backwards, allowing for an active extension of the muscles. Although the process of muscle extension differs from that in humans, it produces a similar consequence, i.e. it extends the muscle.
With respect to the Myotatic and Reciprocal Inhibition reflexes, the artificial muscle model is expected to produce behavior similar to that of the human model: the contraction of one muscle will lead to the active extension of the other. The difference lies in the Reverse Myotatic reflex, which now has an extra inhibitory connection with the antagonist muscle. This connection is actually relevant in the platform, as it prevents high force increases not only in the homonymous but also in the antagonist muscles. As the ECCEROBOT motors are not back-drivable, forces produced by one muscle will necessarily be reflected on the antagonist muscles. The extra inhibitory connection is then essential to actively relax the antagonist muscle and prevent high force increases there.

7 Conclusions

The main goal of this paper is to show that one can obtain reflex connectivity (analogous to that observed in the human spinal cord) using a self-organization process and a simple babbling strategy consisting of SMTs. We have shown that our strategy works for three different reflexes: the Myotatic reflex, the Reverse Myotatic reflex and the Reciprocal Inhibition reflex. We have also shown that the correct connectivity can only be acquired when the influence of external perturbations during the learning stage is limited. Finally, we have shown that different muscle models give rise to different connectivity, which suggests that the reflex connectivity in humans reflects the nature and geometry of the human musculoskeletal system.

References

[1] M. Asada, K. F. MacDorman, H. Ishiguro and Y. Kuniyoshi. Cognitive developmental robotics as a new paradigm for the design of humanoid robots. Robotics and Autonomous Systems 37:185-93, 2001.
[2] M. Asada, K. Hosoda, Y. Kuniyoshi, H. Ishiguro, T. Inui, Y. Yoshikawa, M. Ogino and C. Yoshida. Cognitive Developmental Robotics: A Survey. IEEE Transactions on Autonomous Mental Development 1(1):12-34, 2009.
[3] M. Lungarella, G. Metta, R. Pfeifer and G. Sandini.
Developmental robotics: a survey. Connection Science 15(4):151-190, 2003.
[4] H. F. R. Prechtl. Qualitative changes of spontaneous movements in fetus and preterm infant are a marker of neurological dysfunction. Early Human Development 23:151-8, 1990.
[5] M. Hadders-Algra. Putative neural substrate of normal and abnormal general movements. Journal of Pediatrics 145:S12-S16, 2004.
[6] A. B. Lüchinger, M. Hadders-Algra, C. van Kan and J. de Vries. Fetal Onset of General Movements. Pediatric Research 63:191-5, 2008.
[7] B. S. Kisilevsky and J. A. Low. Human Fetal Behaviour: 100 Years of Study. Developmental Review 18:1-29, 1998.
[8] C. M. van Kan, J. I. P. de Vries, A. B. Lüchinger, E. J. H. Mulder and M. A. M. Taverne. Ontogeny of fetal movements in the guinea pig. Physiology and Behaviour 98:338-44, 2009.
[9] H. F. R. Prechtl, C. Einspieler, G. Cioni, A. F. Bos, F. Ferrari and D. Sontheimer. An early marker of developing neurological handicap after perinatal brain lesions. Lancet 339:1361-1363, 1997.
[10] S. E. Groen, A. C. E. de Blécourt, K. Postema and M. Hadders-Algra. Quality of general movements predicts neuromotor development at the age of 9-12 years. Developmental Medicine and Child Neurology 47:731-8, 2005.
[11] M. Hadders-Algra. Putative neural substrate of normal and abnormal general movements. Neuroscience and Biobehavioural Reviews 31:1181-90, 2007.
[12] O. Sporns and G. M. Edelman. Solving Bernstein's Problem: A Proposal for the Development of Coordinated Movement by Selection. Child Development 64:960-81, 1993.
[13] W. P. Smotherman and S. R. Robinson. The Development of Behaviour Before Birth. Developmental Psychology 32(3):425-434, 1996.
[14] L. Berthouze and Y. Kuniyoshi. Emergence and Categorization of Coordinated Visual Behaviour Through Embodied Interaction. Machine Learning 31:187-200, 1998.
[15] K. M. Newell and D. E. Vaillancourt. Dimensional change in motor learning. Human Movement Science 20:695-715, 2001.
[16] L. Berthouze and M. Lungarella.
Motor Skill Acquisition Under Environmental Perturbations: On the necessity of Alternate Freezing and Freeing Degrees of Freedom. Adaptive Behaviour 12(1):47-64, 2004.
[17] Y. Kuniyoshi and S. Sangawa. Early motor development from partially ordered neural-body dynamics: experiments with a cortico-spinal-musculo-skeletal model. Biological Cybernetics 95:589-605, 2006.
[18] H. Mori and Y. Kuniyoshi. A human fetus development simulation: Self-organization of behaviors through tactile sensation. In IEEE 9th International Conference on Development and Learning, pp. 82-7, 2010.
[19] P. Petersson, A. Waldenström, C. Fåhraeus and J. Schouenborg. Spontaneous muscle twitches during sleep guide spinal self-organization. Nature 424:72-75, 2003.
[20] S. Grillner. Muscle twitches during sleep shape the precise modules of the withdrawal reflex. Trends in Neurosciences 27(4):169-71, 2004.
[21] M. Lungarella and O. Sporns. Mapping Information Flow in Sensorimotor Networks. PLoS Computational Biology 2(10):1301-12, 2006.
[22] E. Bullmore and O. Sporns. Complex brain networks: graph theoretical analysis of structural and functional systems. Nature Reviews Neuroscience 10:186-198, 2009.
[23] R. Pfeifer, M. Lungarella and F. Iida. Self-Organization, Embodiment, and Biologically Inspired Robotics. Science 318:1088-93, 2007.
[24] A. A. Penn and C. J. Shatz. Brain waves and brain wiring: the role of endogenous and sensory-driven neural activity in development. Pediatric Research 45:447-58.
[25] P. Rochat. Self-perception and action in infancy. Experimental Brain Research, pp. 102-109, 1998.
[26] P. Rochat and T. Striano. Perceived self in infancy. Infant Behavior and Development, pp. 513-530, 2000.
[27] N. Kudo and T. Yamada. Development of the monosynaptic stretch reflex in the rat: an in vitro study. Journal of Physiology 369:127-44, 1985.
[28] B. Myklebust and G. Gottlieb. Development of the stretch reflex in the newborn: Reciprocal excitation and reflex irradiation.
Child Development 64(4):1036-45, 1993.
[29] A. Levinsson, M. Garwicz and J. Schouenborg. Sensorimotor transformation in cat nociceptive withdrawal reflex system. European Journal of Neuroscience 11:4327-32, 1999.
[30] M. Bear, B. Connors and M. Paradiso. Neuroscience. S. Katz (ed.), Lippincott-Williams and Wilkins (2nd ed.), 2001.
[31] H-H. Chen, S. Hippenmeyer, S. Arber and E. Frank. Development of the monosynaptic stretch reflex circuit. Current Opinion in Neurobiology 13:96-102, 2003.
[32] S. Wittmeier, M. Jäntsch, K. Dalamagkidis and A. Knoll. Physics-based Modeling of an Anthropomimetic Robot. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2011.
[33] H. Geyer and H. Herr. A Muscle-Reflex Model That Encodes Principles of Legged Mechanics Produces Human Walking Dynamics and Muscle Activities. IEEE Transactions on Neural Systems and Rehabilitation Engineering 18(3), 2010.
[34] E. Coumans. Bullet Physics Library. Sony Computer Entertainment. [Online]. Available: http://www.bulletphysics.com
[35] Embodied Cognition In A Compliantly Engineered Robot (ECCEROBOT). [Online]. Available: http://www.eccerobot.eu
[36] M. Berniker. Linearity, Motor Primitives and Low-Dimensionality in the Spinal Organization of Motor Control. Unpublished doctoral dissertation, MIT USA, 1971.
[37] F. E. Zajac. Muscle and tendon: properties, models, scaling and application to biomechanics and motor control. Critical Reviews in Biomedical Engineering 17(4):359-410, 1989.
[38] B. Kosko. Differential Hebbian learning. AIP Conference Proceedings 151(1):277-82, 1986.
[39] A. V. Hill. The heat of shortening and the dynamic constants of muscle. Proc. R. Soc. Lond. B (London: Royal Society) 126(843):136-195, 1938.
[40] Open Graphics Library. [Online]. Available: http://www.opengl.org
[41] O. Holland and R. Knight. The anthropomimetic principle. J. Burn and M. Wilson (eds.)
Proceedings of the AISB06 Symposium on Biologically Inspired Robotics, 2006.
[42] J. L. Elman. Learning and development in neural networks. Cognition 48:71-99, 1993.
[43] D. Hebb. The organization of behavior. Wiley and Sons, New York, 1949.
[44] S. F. Giszter and W. J. Kargo. Modeling of dynamic controls in the frog wiping reflex: Force-field level controls. Neurocomputing, pp. 1239-1247, 2001.

Bio-Inspired Robotics Lab
Prof. Dr. F. Iida

Title of work: Emergence of Reflexive Behavior from Single Muscle Twitches
Thesis type and date: Master Thesis, August 2011
Supervision: Prof. Dr. F. Iida, Dr. H. Gravato Marques
Student:
Name: Farhan Imtiaz
E-mail: imtiazf@student.ethz.ch
Legi-Nr.: 06-936-246
Semester: Final