Emergence of Reflexive Behavior from Single Muscle Twitches

Farhan Imtiaz
Master Thesis
Bio-Inspired Robotics Lab
Swiss Federal Institute of Technology (ETH) Zurich
Supervision
Prof. Dr. F. Iida
Dr. H. Gravato Marques
August 2011
Preface
I would like to thank all members of the Bio-Inspired Robotics Laboratory
for their collaboration and for the great time we have spent together. I would
especially like to thank my supervisors, Prof. Fumiya Iida and Dr. Hugo
Gravato Marques, for their continuous guidance, encouragement and interesting
discussions during my master thesis, and for providing me with a stimulating
environment that laid the foundation for this work.
Abstract
Due to the growing needs of human society, robots are expected to interact with humans in an increasingly collaborative way. To maximize the potential for such interactions, robots are being developed which imitate characteristics of the human body. However, there is still no standard technique to control such robots. Developmental robotics addresses this problem by exploring the mechanisms of human development. We are interested in how motor control develops through autonomous physical interaction with the environment. In this paper, we explore how basic reflexes (the myotatic reflex, the reverse myotatic reflex and the reciprocal inhibition reflex) can be learned through such interaction, using single muscle twitches to interact with the environment. In our approach we look at biology not only for inspiration but also for validation. The reflexes studied here are relatively well understood in biology, and there are models available which can be used to validate our results. We believe that if we find a principle according to which human reflex connectivity can be autonomously developed, we can then potentially apply the same principle to generate appropriate reflexive behaviour in robotic platforms.
Contents
Abstract
1 Introduction
2 Motivation
3 Necessary conditions for the self-organization of human-reflexes
4 Methods
4.1 Simulation
4.2 Self-organization process
4.3 Reflex Activity
4.4 Experiment 1
4.5 Experiment 2
4.6 Experiment 3
5 Results
5.1 Experiment 1
5.2 Experiment 2
5.3 Experiment 3
6 Discussion
6.1 Timing, thresholding, linearity and modulation of reflex activity
6.2 Emergence of reflex activity using the ECCEROBOT muscle model
7 Conclusions
1 Introduction
In humans the process of development starts in the uterus well before birth.
Here, non-goal directed spontaneous motor activity (SMA) of motor neurons
seems to be a major driving force in the development and maturation of
the young nervous system [4],[5],[7], [17],[18]. Development at this stage is
intrinsically driven by self-organization processes which structure the sensorimotor information flow in the creature according to the systematic exploration of timed sensor and motor activity [12], [21], [22]. During this process,
information about the three-dimensional shape and mechanical properties
of the body is laid down in the synaptic connectivity of sensorimotor systems [2],[25],[26],[23].
The primary way in which SMA might influence motor learning is to serve as an exploratory mechanism which guides the developmental process [24], [14]. In utero, the most frequently observed movement is the General Movement (GM), a non-goal-oriented type of behavior in which all parts of the body participate [4], [5], [6]. GMs play an important role in the development of the nervous system and can be used to predict its integrity [9],[10],[11]. Another observed movement is the Single Muscle Twitch (SMT), which has been observed during sleep in early development and which consists of small contractions of individual muscles [19].
SMTs have been used to explain the development of the withdrawal reflex (in simulation) at an early stage of development [19],[20]. In this paper we follow a similar strategy to self-organize reflexive behaviors involved in the interaction between agonist and antagonist muscles, namely the Myotatic, the Reverse Myotatic, and the Reciprocal Inhibition reflexes (see Fig. 2). The Myotatic reflex responds to an undesired stretch in the muscle imposed by external loads [30]. When an external load causes a muscle to stretch, it activates its Ia fibers, which in turn excite the α-motoneurons of the homonymous muscle (i.e. the same muscle). This results in a muscle contraction which opposes the movement generated by the external load. The Reciprocal Inhibition reflex is involved in agonist-antagonist interactions; the stretch in one muscle inhibits the antagonist muscle from contracting and prevents it from counteracting the movement initiated by the stretched muscle [30]. The Reverse Myotatic reflex prevents muscles from producing excessive forces [30]. Strong contractions cause large force increases in the tendons, which activate the Ib fibers. This activity inhibits the homonymous motor neurons and decreases the activity of the muscle. These reflex behaviors may in future help to understand the development of locomotion [33].
The methodology is shown in Fig. 1. First, a SMT activates a given motoneuron, which through the mechanical apparatus of the body results in systematic changes in the sensor values. Second, the temporally correlated sensorimotor activity is captured and local body maps are formed accordingly. Third, when an external stimulus is applied, the relevant muscles (those identified as connected to the relevant sensor in the second stage) produce involuntary and stereotyped behaviors [17], [18]. Our primary hypothesis in this paper is that reflexes (whether innate or ontogenetically acquired) are not mere arbitrary behaviors exploited by evolution; they reflect the geometry and mechanics of the human body.
The remainder of this paper is organized as follows. The second section
introduces the ECCEROBOT platform as the main motivation for this paper.
The third section describes the necessary conditions for the emergence of the
reflexes mentioned above. The fourth and fifth sections describe the methods
and the results obtained, respectively. The sixth section discusses the results
obtained. The last section provides the main conclusions of our work.
2 Motivation
The current work is done in the context of the ECCEROBOT project, which attempts to identify suitable approaches to control compliant tendon-driven robots (see Fig. 3). The ECCEROBOT is a hand-molded anthropomimetic robot, which aims to capture as closely as possible the compliance and complexity of the human musculo-skeletal system [35],[41]. The actuation principle of the robot consists of a DC motor and a gearbox in series with a piece of kiteline and an elastic shock cord. The motor and the shock cord are attached to different limb parts. When actuated, the motor winds up the kiteline and stretches the shock cord, which produces a force on the attachment point, analogous to a muscle contraction.
Figure 1: Methodology. (I) Single Muscle Twitches: a SMT drives the actuators and, through the mechanical structure of the body, induces sensor responses. (II) Information Structure (body mappings) between actuators and sensors. (III) Reflexive Behaviour: externally induced sensor information produces reflexive behaviour in the actuators.
Figure 2: The reflexes investigated in this paper. The stars represent α-motoneurons, the large solid circles represent inhibitory interneurons, semicircular arcs represent excitatory connections, and small solid circles represent inhibitory connections. Type Ia sensors provide information about the changes in the muscle length and type Ib sensors mainly provide information about changes in muscle tension. a) The Myotatic reflex is carried out through an excitatory connection between the Ia fibers and the α-motoneurons of the homonymous muscle. The Reciprocal Inhibition reflex is carried out through an inhibitory connection between the Ia fibers and the α-motoneurons of the antagonist muscle; this connection is mediated by an inhibitory interneuron. b) The Reverse Myotatic reflex is carried out through an inhibitory connection between the Ib fibers and the α-motoneurons of the homonymous muscle; this connection is also mediated by an inhibitory interneuron.
Figure 3: The anthropomimetic ECCEROBOT [35],[41].
The lack of an appropriate model of the robot, and the complexity of the interactions between the different body parts, led us to adopt a developmental approach in which the sensorimotor interactions observed are used to autonomously develop increasingly complex behaviors [1], [2], [3]. This paper describes the first stage of such an investigation. In our approach we look at biology not only for inspiration but also for validation. The reflexes sought here are relatively well understood in biology, and there are models available which can be used to validate our results [27], [28], [29], [30], [31]. If we find a principle according to which human reflex connectivity can be autonomously developed, we can then potentially apply the same principle to generate appropriate reflexive behaviour in our robotic platform.
3 Necessary conditions for the self-organization of human-reflexes
In this paper we hypothesize that at least some human reflexes (those investigated here) reflect the nature of the human body, and this requires a
system which has similar properties to those of the human muscular-skeletal
system. We propose four necessary conditions for the emergence of the three
reflexes proposed in this paper:
1. Availability of Length and Force sensors. The reflexes investigated
involve sensors which can measure the changes in muscle length (Ia fibers) as
well as changes in muscle force (Ib fibers). The first condition requires each
muscle in the system to be equipped with both types of sensors.
2. Agonist-Antagonist muscle pair. The reflexes investigated require the interaction between agonist and antagonist muscles. Therefore the testing system must consist of agonist and antagonist muscle pairs.
3. Perturbation-free Environment. We hypothesize that our reflexes will only emerge if the probability of external perturbations is small. In humans, during the first stages of development the environment is rather static and predictable, which favors the learning of basic sensorimotor interactions [8], [13], [15], [16]. This is certainly valid for the foetus in the womb as well as for newborn babies while sleeping. One of our hypotheses is that a static environment is fundamental for the development of basic reflexes.
4. Biological muscle model. The muscle model used must have similar properties to those of the human muscle. First, the muscles must have asymmetrical conditioning, i.e. they can only produce active force when contracting but not when extending. Second, the muscle should impose a very small resistance to movement when relaxed. And third, the muscle should not go slack.
While the first two conditions are self-explanatory, the others require
empirical data to be verified. We will investigate the impact of external
perturbations in the system (see Section 4.4 and 4.5) and compare the reflexes
acquired using a human muscle model with those acquired using the muscle
model used in the ECCEROBOT (see Section 4.4 and 4.6).
4 Methods
4.1 Simulation
For this investigation we used a virtual 3D model of the human legs. This model includes one hip joint, one knee joint and one ankle joint in each leg. In our experiments only the hip joints were used; they are modeled as ball and socket joints (see Fig. 4). For the actuation we use one agonist-antagonist pair of muscles in each leg: the Biceps femoris and the Quadriceps femoris, which (in our case) are mainly responsible for hip extension and flexion, respectively.
Each muscle is simulated as a straight line between two rigid bodies (see Fig. 4) and it includes two types of sensors: one that measures the length of the muscle, i.e. the distance between the two attachment points, and one that measures the force at the attachment points. The derivatives of these signals provide information analogous to that of the Ia fibers and the Ib fibers.
Two additional muscles have been implemented in each leg to test the reflexes; these muscles are external to the legs and are used only to produce external sensory stimulation. The simulation of the leg dynamics has been carried out using the ECCEROBOT physics simulator, Caliper [32]. The framework is based on Bullet Physics [34] and OpenGL graphics [40], and it allows the real-time simulation of the interactions between a large number of rigid bodies, different types of joints and different muscle models.
Figure 4: Diagram of the hip model implemented. Each muscle consists of a straight line (represented with dashes) connecting two attachment points (filled circles). There are four muscles in the system: the Quadriceps (in our system responsible for hip extension), the Biceps (in our system responsible for hip flexion), and two external muscles (Ext1 and Ext2) which are used only to stimulate the reflex activity. Each muscle has two sensors: one that estimates the force, F, produced at the attachment points, and one that estimates the length, L, of the muscle.
The main muscle model used in our investigation is based on a 2-element nonlinear Hill model [36], [39], [37]. This model captures, in a simple way, the contraction of muscle fibers as well as basic muscle dynamics. The two elements are an active contractile element in parallel with a passive elastic element. The contractile element models the active force generated by the muscle fibers. This element includes a damping mechanism that simulates the force-velocity relation of the human muscle. The passive elastic element models the muscle fiber's resistance to deflection and prevents the muscle from going slack. The force produced at the attachment points is given by:
$$F = F_{CE} + F_{PE} \tag{1}$$
where $F_{CE}$ is the force produced by the contractile element, and $F_{PE}$ is the force produced by the passive spring element. These forces are given by:
$$F_{CE} = \frac{\alpha}{1 + C \cdot v_m^2}$$
$$F_{PE} = K_{PE} \cdot (l_t - l_0)$$
where $C$ and $K_{PE}$ are constant factors, $\alpha$ is the motor activation, $v_m$ is the length change of the muscle, $l_t$ is the current length of the muscle and $l_0$ is the resting length of the muscle. The contractile element includes a damping mechanism that simulates the force-velocity relation of biological muscles [39]. The force generated by the passive element simulates the muscle resistance to deflection and prevents the muscle from going slack. In our system, as in biology, the passive force $F_{PE}$ of the muscle is significantly smaller than the force $F_{CE}$ generated by the contractile element.
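As an illustration of Eq. (1), the following minimal Python sketch evaluates the two force components for a single muscle. It assumes that the damping term of the contractile element is C multiplied by the squared length change, which is one reading of the printed formula, and the function and argument names are hypothetical. The default constants are the C and KPE values listed in Table 4.1.

```python
def hill_muscle_force(activation, length_change, length, rest_length,
                      c=106.0, k_pe=1.0):
    """Two-element muscle force: contractile element plus passive spring.

    Assumption: the contractile force is activation / (1 + C * v^2), where v
    is the length change of the muscle. All names here are illustrative.
    """
    # Active force, reduced as the muscle changes length faster (damping).
    f_ce = activation / (1.0 + c * length_change ** 2)
    # Passive elastic force, proportional to the stretch beyond resting length.
    f_pe = k_pe * (length - rest_length)
    return f_ce + f_pe


if __name__ == "__main__":
    # Isometric case: no length change, muscle at its resting length.
    print(hill_muscle_force(2.0, 0.0, 0.30, 0.30))
    # Same activation while the muscle length is changing.
    print(hill_muscle_force(2.0, 0.1, 0.30, 0.30))
```

With no length change and the muscle at rest length, the force equals the activation; any length change lowers the active force, which is the damping behavior described in the text.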
For comparison, we use a second muscle model which simulates the dynamics of the ECCEROBOT artificial muscle. The model simulates the
dynamics of a DC motor with a gearbox in series with a kite line and an
elastic shock cord. When the motor is actuated it winds up the kite line and
expands the shock cord; this produces a force that brings the attachment
points closer and simulates an analogue to a muscle contraction. The force,
F , produced by the muscle is given by:
$$F = F_S + F_D = K \, \Delta L_S + D \, \frac{d}{dt} \Delta L_S \tag{2}$$
where FS is the spring force in the kiteline, FD is a spring damping force, K
is the spring constant, and D is the damping constant. As muscles can only
pull and not push, the additional condition F ≥ 0 is added. The force, FM,
generated by the motor on the kite line is calculated as:
$$F_M = \frac{\tau_{LG}}{r} \tag{3}$$
where r is the radius of the motor shaft, and τLG is the torque generated by
the gear box. This torque is calculated as:
$$\tau_{LG} = N \eta \, \tau_{LM} - \tau_{CG} - \omega_G \mu_G - J_G \frac{d\omega_G}{dt} \tag{4}$$
where, N is the gearbox ratio, η is the gearbox efficiency, τCG is the Coulomb
friction gearbox torque, µG is the gearbox viscous friction constant and JG
is the gearbox inertia, and τLM is the motor load torque. The motor load
torque is given by:
$$\tau_{LM} = K_T \cdot i - J_M \frac{d\omega_M}{dt} - \mu_M \omega_M - \tau_{CM} \tag{5}$$
where KT is the torque constant, i is the motor current, JM is the motor
inertia, µM is the motor viscous friction constant, ωM is the motor angular
velocity, and τCM is the Coulomb motor friction torque. We used the same
parameters as those identified in [32] using Multiple Regression Analysis from
observed data. The parameters used in our simulation are shown in Table 4.1.
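Table 4.1: Parameters used in the simulation.
| Parameter | Value |
| --- | --- |
| $K_{PE}$ | 1 |
| $C$ | 106 |
| $K$ | 10000 |
| $D$ | 500 |
| $r$ | 0.005 m |
| $N$ | 100 |
| $\eta$ | 0.09618 |
| $\tau_{CG}$ | 0.001 |
| $\mu_G$ | 0.001 |
| $J_G$ | 2e-8 |
| $\tau_{LM}$ | 0.001 |
| $K_T$ | 7.99e-3 |
| $J_M$ | 1e-6 |
| $\mu_M$ | 0.001 |
| $\tau_{CM}$ | 0.001 |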
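The chain of Eqs. (2) to (5) can be sketched in Python as follows. This is only an illustrative evaluation of the printed formulas with the values from Table 4.1, not the Caliper implementation: the class and function names are hypothetical, the gearbox and motor velocities are passed in separately because the text does not state how they are coupled, and the shock cord stretch is treated as a given input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MuscleParams:
    """Parameter values taken from Table 4.1 (field names are illustrative)."""
    k: float = 10000.0     # spring constant K
    d: float = 500.0       # damping constant D
    r: float = 0.005       # motor shaft radius r
    n: float = 100.0       # gearbox ratio N
    eta: float = 0.09618   # gearbox efficiency
    tau_cg: float = 0.001  # Coulomb friction torque of the gearbox
    mu_g: float = 0.001    # viscous friction constant of the gearbox
    j_g: float = 2e-8      # gearbox inertia
    k_t: float = 7.99e-3   # torque constant
    j_m: float = 1e-6      # motor inertia
    mu_m: float = 0.001    # viscous friction constant of the motor
    tau_cm: float = 0.001  # Coulomb friction torque of the motor


def motor_load_torque(current, omega_m, domega_m, p):
    """Eq. (5): motor load torque."""
    return p.k_t * current - p.j_m * domega_m - p.mu_m * omega_m - p.tau_cm


def gearbox_torque(tau_lm, omega_g, domega_g, p):
    """Eq. (4): torque at the gearbox output."""
    return p.n * p.eta * tau_lm - p.tau_cg - omega_g * p.mu_g - p.j_g * domega_g


def kiteline_force(tau_lg, p):
    """Eq. (3): force generated by the motor on the kite line."""
    return tau_lg / p.r


def muscle_force(stretch, stretch_rate, p):
    """Eq. (2) with the additional condition F >= 0 (muscles only pull)."""
    return max(0.0, p.k * stretch + p.d * stretch_rate)


if __name__ == "__main__":
    p = MuscleParams()
    tau_lm = motor_load_torque(1.0, 0.0, 0.0, p)  # 1 A of current, motor at rest
    tau_lg = gearbox_torque(tau_lm, 0.0, 0.0, p)
    print(kiteline_force(tau_lg, p))
    print(muscle_force(0.001, 0.0, p), muscle_force(-0.01, 0.0, p))
```

Clamping the force at zero in `muscle_force` mirrors the condition F ≥ 0 stated in the text; a negative stretch therefore yields no force rather than a pushing force.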
4.2 Self-organization process
The main purpose of the self-organization process is to identify the relationship between sensor and motor activity (step II in Fig.1). This is done in
two stages. In the first stage we identify which motor is connected to each
sensor; in the second stage, we quantify the strength of these connections
and characterize their nature (excitatory or inhibitory).
In the first stage we normalize the sensor and motor activity as follows:
$$\bar{S}_j = \begin{cases} 0, & \text{if } N_{min} \le \dot{S}_j \le N_{max} \\ 1, & \text{otherwise} \end{cases}, \qquad \bar{M}_i = \begin{cases} 0, & \text{if } M_i = 0 \\ 1, & \text{otherwise} \end{cases} \tag{6}$$
where $\bar{S}_j$ is the normalized activity of sensor j, $\dot{S}_j$ is the derivative of sensor j, $\bar{M}_i$ is the normalized activity in motor i, and $M_i$ is the signal sent to motor i. $N_{min}$ and $N_{max}$ are the noise boundaries extracted from the sensor signals collected when the system is at rest (i.e. with no motor activation).
The connectivity is calculated as:
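$$C_{i,j} = \max_{l} \frac{\mathrm{xCorr}(\bar{S}_j^{\,l}, \bar{M}_i)}{\mathrm{xCorr}(\bar{M}_i^{\,0}, \bar{M}_i^{\,0})}, \qquad l = 0, \ldots, maxLag \tag{7}$$
where xCorr(X, Y) is the cross-correlation between signals X and Y, $\bar{S}_j^{\,l}$ is the normalised sensor signal j lagged l samples, $\bar{M}_i$ is the normalised motor signal i, and maxLag is the maximum lag allowed for the sensor data.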
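The normalization of Eq. (6) and the lagged connectivity of Eq. (7) can be sketched in plain Python as below. This is a minimal sketch under stated assumptions: the cross-correlation is taken as a plain sum of products over the overlapping samples, the lag is interpreted as the sensor response following the motor command, and all function names are hypothetical.

```python
def normalise_sensor(sensor_rate, n_min, n_max):
    """Eq. (6), sensors: 0 inside the noise band [n_min, n_max], 1 outside."""
    return [0 if n_min <= v <= n_max else 1 for v in sensor_rate]


def normalise_motor(motor):
    """Eq. (6), motors: 0 when no signal is sent, 1 otherwise."""
    return [0 if m == 0 else 1 for m in motor]


def xcorr(x, y, lag=0):
    """Sum of x[t + lag] * y[t] over the overlapping samples (assumed form)."""
    overlap = min(len(x) - lag, len(y))
    return sum(x[t + lag] * y[t] for t in range(overlap))


def connectivity(s_bar, m_bar, max_lag):
    """Eq. (7): best lagged correlation, scaled by the motor autocorrelation."""
    energy = xcorr(m_bar, m_bar)
    if energy == 0:
        return 0.0  # the motor never fired, so no connection can be inferred
    return max(xcorr(s_bar, m_bar, lag) / energy for lag in range(max_lag + 1))


if __name__ == "__main__":
    motor = [0.0] * 20
    motor[5:8] = [2.0, 2.0, 2.0]       # one twitch lasting three samples
    rate = [0.0] * 20
    rate[7:10] = [0.5, 0.5, 0.5]       # sensor responds two samples later
    s_bar = normalise_sensor(rate, -0.1, 0.1)
    m_bar = normalise_motor(motor)
    print(connectivity(s_bar, m_bar, max_lag=5))
    print(connectivity(s_bar, m_bar, max_lag=0))
```

Dividing by the zero-lag autocorrelation of the motor signal bounds the value by one for these binary signals, which is what makes a single threshold meaningful across motors with different numbers of twitches.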
This connectivity is then thresholded. Values above a certain threshold,
T , establish a connection between a given sensor and a given motor, and
values below this threshold establish no such connection:
$$A_{i,j} = \begin{cases} 1, & \text{if } C_{i,j} > T \\ 0, & \text{otherwise} \end{cases} \tag{8}$$
where A represents the adjacency matrix with the connectivity between sensors and actuators. The connectivity is characterised and quantified as:
$$Q_{i,j} = - \frac{\int_0^t \dot{S}_j \, dt}{\int_0^t M_i \, dt \cdot \max(\dot{S}_j)} \tag{9}$$
where Q is the final connectivity matrix. Excitatory connections are characterized by a positive value and inhibitory connections by a negative value.
The strength of the connection is given by its magnitude. In reality this process is very similar to that of differential Hebbian learning [38] using standard
cross correlation; but our method allows for a more intuitive way of setting
the connectivity threshold.
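The thresholding of Eq. (8) and the quantification of Eq. (9) can be illustrated with the short Python sketch below. It rests on several assumptions that are not stated in the text: the integrals are approximated with a rectangle rule, max(Ṡj) is read as the peak magnitude of the derivative, and the function names are hypothetical.

```python
def adjacency(c, threshold):
    """Eq. (8): 1 where the connectivity exceeds the threshold T, else 0."""
    return [[1 if value > threshold else 0 for value in row] for row in c]


def quantify(s_dot, motor, dt):
    """Eq. (9): signed, normalised weight between one motor and one sensor.

    Assumptions: rectangle-rule integration, and max(S_dot) taken as the
    peak magnitude of the sensor derivative.
    """
    sensor_integral = sum(s_dot) * dt
    motor_integral = sum(motor) * dt
    peak = max(abs(v) for v in s_dot)
    if motor_integral == 0 or peak == 0:
        return 0.0  # no motor activity or no sensor response
    return -sensor_integral / (motor_integral * peak)


if __name__ == "__main__":
    dt = 0.05                               # 20 Hz sampling
    motor = [0.0, 1.0, 1.0, 0.0]
    shortening = [0.0, -2.0, -2.0, 0.0]     # own length sensor during a twitch
    force_rise = [0.0, 3.0, 3.0, 0.0]       # own force sensor during a twitch
    print(quantify(shortening, motor, dt))  # positive: excitatory
    print(quantify(force_rise, motor, dt))  # negative: inhibitory
    print(adjacency([[0.9, 0.1], [0.2, 0.8]], threshold=0.5))
```

The leading minus sign in Eq. (9) is what turns a shortening of the homonymous muscle into an excitatory weight and a rise in its own force into an inhibitory one.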
4.3 Reflex Activity
Intuitively, the connectivity in Q describes motor-to-sensor connections, as
the directed flow of information is from motors to sensors. However, Q can
also describe directed sensor-to-motor connectivity. This process is based
on the idea of motor-directed somatosensory imprinting (MDSI) proposed
in [19]. This formulation can be seen as a reverse type of Hebbian learning
where “the post-synaptic activity in reflex interneurons [and motoneurons]
precedes the afferent input” [19].
In this way the reflex activity is given by the external sensor stimulation measured in each sensor, weighted by the respective sensor-to-motor
connection:
$$M_i = \sum_{j=1}^{m} Q_{i,j} \, \frac{\dot{S}_j}{\max(\dot{S}_j)} \tag{10}$$
To allow for a stable simulation we compute the motor activity based on an average of five samples of the sensor derivative signals. The collection of the five samples starts as soon as one derivative falls outside the noise boundaries identified during the learning stage (Nmin and Nmax). The motor signal is kept constant for the next five samples, after which the sensor readings are averaged and a new motor signal is computed.
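A minimal Python sketch of Eq. (10) together with the five-sample averaging is given below. It makes simplifying assumptions: the peak value used for the normalization of each sensor is supplied by the caller (for instance from the learning stage), the threshold-triggered start of the first window is omitted, and the function names are hypothetical.

```python
def reflex_command(q, s_dot, s_dot_peak):
    """Eq. (10): motor signals as a weighted sum of normalised derivatives.

    q[i][j] is the weight from sensor j to motor i. s_dot_peak[j] is the
    normalising peak of sensor j (assumed to be known beforehand).
    """
    return [
        sum(w * s / p for w, s, p in zip(row, s_dot, s_dot_peak) if p != 0)
        for row in q
    ]


def windowed_commands(q, samples, s_dot_peak, window=5):
    """Hold each command for one window while averaging the next one."""
    command = [0.0] * len(q)
    held = []
    for start in range(0, len(samples), window):
        block = samples[start:start + window]
        held.extend([command] * len(block))  # previous command stays active
        if len(block) == window:
            mean = [sum(column) / window for column in zip(*block)]
            command = reflex_command(q, mean, s_dot_peak)
    return held


if __name__ == "__main__":
    q = [[1.0, -0.5], [-1.0, 0.0]]            # two motors, two sensors
    peak = [4.0, 2.0]
    stimulus = [[2.0, 1.0]] * 5 + [[0.0, 0.0]] * 5
    for step, motors in enumerate(windowed_commands(q, stimulus, peak)):
        print(step, motors)
```

In this sketch the command computed from one block of five samples is applied during the following block, which is one way to read "kept constant for the next five samples".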
To produce the external stimulus we contract the external muscles, Ext1
and Ext2, in each leg. Each of these muscles produces a given displacement
in one of the limbs. This displacement is then caught by the different muscle
sensors and converted into reflex motor activity using the matrix Q.
4.4 Experiment 1
The main goal of Experiment 1 is to verify that the reflex connectivity can be learned using SMTs and the self-organization process described above. This experiment is carried out using the Hill muscle model (see Section 4.1).
The simulation starts with all the muscles relaxed. In this condition the legs fall straight down due to the effects of gravity. We start the learning process by performing extremely small contractions in all the leg muscles to produce some sensor noise. We have tried different perturbations ranging from one order of magnitude lower than the magnitude of the SMTs to half of the magnitude of the muscle twitches. No significant effects were observed on the results obtained. During this time sensor data is collected and the noise boundaries (Nmin and Nmax) are identified. The noise collection lasts for 20 seconds.
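One simple way to obtain the noise boundaries from the data collected in this interval is sketched below in Python. Taking the minimum and maximum of the finite-difference derivative is an assumption made for illustration; the text does not specify how the boundaries are extracted, and the function name is hypothetical.

```python
def noise_bounds(rest_signal, dt):
    """Return (n_min, n_max) of the sensor derivative recorded at rest.

    Assumption: the boundaries are the extreme values of the
    finite-difference derivative over the noise collection interval.
    """
    rates = [(b - a) / dt for a, b in zip(rest_signal, rest_signal[1:])]
    return min(rates), max(rates)


if __name__ == "__main__":
    # Four samples of a length sensor at 20 Hz while only noise is present.
    n_min, n_max = noise_bounds([0.0, 0.01, -0.01, 0.0], dt=0.05)
    print(n_min, n_max)
```

Any derivative that later falls outside this band is then treated as genuine sensor activity by the normalization step of Section 4.2.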
After the noise collection interval we perform a number of SMTs in each muscle (30 in total). This is done by repeatedly selecting a muscle at random (from a uniform distribution) and contracting the selected muscle with a constant activation for a short period of time. After a SMT the system waits a fixed time before selecting the next muscle to twitch. The waiting time is fixed to a value large enough to allow the system to stabilise, i.e. to stop oscillating. During the SMTs the sensor and motor signals are collected at a rate of 20 Hz.
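The twitch schedule described above can be sketched in Python as follows. The twitch duration, waiting time and activation level used in the example are hypothetical placeholders, since the text only states that the twitch is short and the wait is long enough for the system to settle; the function name is also illustrative.

```python
import random


def smt_schedule(muscles, n_twitches, twitch_s, wait_s, amplitude,
                 rate_hz=20, seed=None):
    """Build per-muscle motor signals for a sequence of single muscle twitches.

    One muscle is drawn uniformly at random per twitch, driven with a constant
    activation, and followed by a fixed rest period for every muscle.
    """
    rng = random.Random(seed)
    twitch_n = round(twitch_s * rate_hz)  # samples per twitch
    wait_n = round(wait_s * rate_hz)      # samples of rest after each twitch
    signals = {name: [] for name in muscles}
    for _ in range(n_twitches):
        active = rng.choice(muscles)
        for name in muscles:
            level = amplitude if name == active else 0.0
            signals[name].extend([level] * twitch_n)
            signals[name].extend([0.0] * wait_n)
    return signals


if __name__ == "__main__":
    # Placeholder timing and amplitude values, chosen only for illustration.
    schedule = smt_schedule(["RQ", "RB", "LQ", "LB"], n_twitches=30,
                            twitch_s=0.5, wait_s=2.0, amplitude=2.0, seed=1)
    for name, signal in schedule.items():
        print(name, sum(1 for v in signal if v != 0.0), "active samples")
```

Because only one muscle is active in each twitch window, any sensor change inside that window can be attributed to a single motor, which is what the correlation step of Section 4.2 relies on.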
The testing of the reflexes is then carried out by contracting each one of
the external muscles and allowing the reflexive behavior to be expressed.
4.5 Experiment 2
The goal of Experiment 2 is to investigate the role of external perturbations during the learning stage (i.e. during the SMTs) and validate the hypothesis that perturbations significantly affect the reflex connectivity obtained (see the third necessary condition in Section 3). For this purpose we repeated Experiment 1, but now introducing small perturbations in the system with a given probability. We varied the perturbation probability from 0 to 1 at 0.05 intervals. The perturbations consisted of short contractions with constant intensity (half the magnitude of the SMT) carried out by the external muscles.
Figure 5: Motor activity produced by SMTs (panel title: Motor activity during SMT; traces MRQ, MRB, MLQ and MLB versus time in seconds).
4.6 Experiment 3
The goal of Experiment 3 is to investigate the impact of a non-biologically
inspired muscle model – the ECCEROBOT muscle model – on the reflex
connectivity acquired (Section 3). For this purpose we repeated Experiment
1 using the ECCEROBOT muscle model described in Section 4.1. There are
two main differences between the two muscle models. First, contrary to the
biological muscles, the ECCEROBOT muscles do not passively extend when
relaxing; extension is limited to the extension of the shock cord in series with
the motor. Second, the ECCEROBOT muscles do go slack when the motor
unwinds beyond the muscle resting length.
5 Results
5.1 Experiment 1
Figure 5 shows the raw motor activity produced by the SMTs. As can be seen, the twitches occur only one at a time. Figure 6 shows the raw force data collected for the two muscles of the right leg during a SMT carried out by the Quadriceps (left) and by the Biceps (right). As can be seen, activity in the force sensor is only observed during contractions of the homonymous muscle but not during contractions of the antagonist muscle. This is because when a muscle is relaxed the only force in the muscle is due to the passive element, which has a negligible magnitude when compared with the active force that can be produced by the muscle. Similar results have been observed for the left leg.
Figure 6: Raw force responses of the force sensors of the Quadriceps (blue) and Biceps (green) for the right leg in response to a SMT (red) carried out by the Quadriceps (left) and by the Biceps (right). Axes: force (N) versus time (s).
Figure 7 shows the raw length data collected for the two muscles of the
right leg during a SMT carried out by the Quadriceps (left) and by the
Biceps (right). Contrary to the force sensors, the length sensors change
their values in response to contractions of the homonymous as well as the
antagonist muscles. Similar behaviour has been observed for the left leg.
The data collected suggests a connectivity between the force sensors and their homonymous motor, and a connectivity between the length sensors and both their homonymous and antagonist muscles. Figure 8a shows that this is in fact the connectivity obtained. As expected, no connectivity exists between muscles in one leg and sensors from the other. The quantified matrix is shown in Fig. 8b. As can be seen, the connectivity obtained is in qualitative terms similar to that observed in relation to human reflexes. First, we obtain excitatory connections between the length sensors and their homonymous muscles (as in the Myotatic reflex). Second, we obtain inhibitory connections between the length sensors and their antagonist muscles. And third, we obtain inhibitory connections between the force sensors and their homonymous muscles.
Figure 7: The activity of the length sensors of the Quadriceps (blue) and Biceps (green) in the right leg in response to a SMT (red) carried out by the Quadriceps (left) and by the Biceps (right). For clarity, we have subtracted the resting length from the total length of each muscle. Axes: length (cm) versus time (s).
We tested the muscle activity in response to a muscle stretch caused by the artificial muscles. Figure 9 shows the muscle activity when an external load increases the length of the Quadriceps muscle of the right leg. As can be seen, the first reaction to the muscle stretch causes the Quadriceps to contract, and at the same time inhibits the Biceps from contracting. The first activation of the Quadriceps slows down the length increase and causes a drop in the following activation. This behavior is consistent with that observed in relation to the Myotatic and Reciprocal Inhibition reflexes.
To investigate the effect of the Reverse Myotatic reflex on the decrease of muscle activity we repeated the experiment with and without the force sensor, and compared the muscle activations in both conditions. We observed that the decrease in activity is due mainly to the slower length increase caused by the first activation rather than to the Reverse Myotatic reflex. The drop due to the force component is in fact very small (0.25 motor units). Although small, this behavior is also consistent with the Reverse Myotatic reflex (see Section 6.1 for discussion).
Figure 8: The adjacency matrices A (a, panel title: Adjacency Matrix) and Q (b, panel title: Quantified Adjacency Matrix), with the motors (MRQ, MRB, MLQ, MLB) on one axis and the force and length sensors (FRQ, FRB, FLQ, FLB, LRQ, LRB, LLQ, LLB) on the other. M stands for motor, F stands for force sensor and L for length sensor; in the subscripts, RQ stands for Quadriceps of the right leg, RB for Biceps of the right leg, LQ for Quadriceps of the left leg, and LB for Biceps of the left leg. Black blocks represent excitatory connections and white blocks represent inhibitory connections.
Figure 9: Muscle activity generated in reaction to a stretch in the Quadriceps caused by an external load (panel title: Quadriceps Femoris Reaction; traces Load, LRQ, FRQ, MRQ and MRB versus time in seconds).
Figure 10: Quantified Adjacency Matrix with Perturbations.
5.2 Experiment 2
In Experiment 2 we investigate the effect of perturbations during the self-organization process. Our results indicate that for a perturbation probability higher than 0.1 we cannot obtain the right reflex matrix using our framework. This is because external perturbations induce information in the sensor signals which interferes with that produced by the SMTs. Figure 10 shows the connectivity obtained for a perturbation probability of 0.2. As can be seen, connections are mistakenly identified between length sensors in the right leg and muscles in the left leg. In addition, the system presents several missing connections between length sensors and homonymous and antagonist muscles in both legs. The force connectivity is correct because the force values are only significant when the homonymous muscles contract (as in biology).
5.3 Experiment 3
Experiment 3 investigates the effect of using a different muscle model on the self-organization process. For this experiment we used the ECCEROBOT muscle model, in which a non-back-drivable motor produces a strong resistance to movement when no voltage is applied. Figure 11 shows the connectivity obtained. As can be seen, the connectivity of the length sensors is the same as with the human muscle model. This connectivity is valid since when one muscle contracts it extends the shock cord of the antagonist and decreases the length of the homonymous muscle. In the force sensor connectivity we observe an extra inhibitory connection between each force sensor and the antagonist muscle. This occurs because when one muscle contracts, the tension on the antagonist muscle also increases due to the lack of back-drivability in the motors.
Figure 11: Quantified Adjacency Matrix with ECCEROBOT Muscle Model.
6 Discussion
6.1 Timing, thresholding, linearity and modulation of reflex activity
The main goal of this paper is not so much to demonstrate appropriate
human-like reflex behaviour but rather to show that the correct reflex connectivity can emerge from SMT. When it comes to behavioral expression even
the simplest reflexes (as those investigated here) is far from trivial. First,
different reflexes have different activity thresholds, for example the Reverse
Myotatic reflex seems to require a significantly higher threshold to be activated than the Myotatic reflex. At the moment it is not clear how this can
be developed in an automated way.
Second, different reflexes appear at different times depending on the num-
ber of interneurons they entail. For example, the Reverse Myotatic reflex
takes longer to be active than the Myotatic reflex because it entails one interneuron in its path while the Myotatic entails none. In our platform the
Reverse Myotatic reflex also appears later but this is due the fact that force
only appears after the muscle starts contracting; this is also valid in humans.
Third, it is quite unlikely that the motor signal is a linear combination
of the sensor values. Typically a non-linear function such as the sigmoid is
used. Although that can easily be incorporated into our system, we see no
direct benefit in doing so at such a preliminary stage of our work.
Fourth, and most importantly, all the reflex connectivity can be modulated by
the supra-spinal and the central nervous systems (REF). This means that the
gains of all the connection weights can be manipulated by hierarchically
superior systems, allowing for different reflex activity in different
behaviors. This is important because it reduces the relevance of the exact
strength of the connectivity and places the highest emphasis on the nature of
the connectivity identified (inhibitory or excitatory), which cannot be
modified. Nonetheless, we believe that in artificial systems it is relevant
to identify at least an appropriate order of magnitude for the weights, one
that is in accordance with the range of the raw sensor data.
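The effect of such modulation can be sketched as a positive gain applied to a learned weight matrix: the strength of every connection changes, but its sign (excitatory or inhibitory) does not. The helper names `modulate` and `sign`, the matrix layout and the weight values are hypothetical and serve only as an illustration under these assumptions.

```python
def modulate(weights, gain):
    """Scale every connection weight by a positive descending gain.

    A positive gain changes the strength of each connection but can never
    flip its sign, so excitatory connections stay excitatory and inhibitory
    connections stay inhibitory.
    """
    if gain <= 0.0:
        raise ValueError("gain must be positive to preserve connection signs")
    return [[gain * w for w in row] for row in weights]


def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


# Hypothetical learned weights: rows are muscles, columns are sensors.
learned = [[0.8, -0.3],
           [-0.4, 0.0]]

relaxed = modulate(learned, 0.5)  # reduced reflex activity
alert = modulate(learned, 2.0)    # heightened reflex activity
```

Because only the gain differs between `relaxed` and `alert`, what the learning stage has to get right is the sign pattern and a sensible order of magnitude.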
6.2 Emergence of reflex activity using the ECCEROBOT muscle model
In the experiment carried out using the ECCEROBOT muscle model we can
see that our approach scales rather well to produce relevant reflex activity.
In the ECCEROBOT model, negative motor signals (or the inhibition of
motor activity) cause the motors to drive backwards, allowing for an
active extension of the muscles. Although this process of muscle extension
differs from that in humans, it produces a similar consequence, i.e. it
extends the muscle. With respect to the Myotatic and Reciprocal Inhibition
reflexes, the artificial muscle model is expected to produce similar behavior:
the contraction of one muscle leads to the active extension of the other.
The difference lies in the Reverse Myotatic reflex, which now has an extra
inhibitory connection with the antagonist muscle. This connection is actually
relevant in the platform as it prevents high force increases not only in the
homonymous but also in the antagonist muscles. As the ECCEROBOT
motors are not back-drivable, forces produced by one muscle will necessarily
be reflected on the antagonist muscles. The extra inhibitory connection is
therefore essential to actively relax the antagonist muscle and prevent high
force increases there.
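The difference between the two connectivity patterns can be made concrete with a small sketch that keeps only the sign of each connection. The sensor ordering, the dictionary names `HUMAN_LIKE` and `ECCEROBOT_LIKE` and the unit magnitudes are assumptions introduced for illustration; they summarize the three reflexes discussed here and are not the weights learned in our experiments.

```python
# Sensor order: [stretch_A, stretch_B, force_A, force_B].
# Muscles A and B form an antagonist pair.
# Entries are connection signs only: +1 excitatory, -1 inhibitory, 0 none.
HUMAN_LIKE = {
    "A": [+1, -1, -1, 0],  # Myotatic, Reciprocal Inhibition, Reverse Myotatic
    "B": [-1, +1, 0, -1],
}
ECCEROBOT_LIKE = {
    "A": [+1, -1, -1, -1],  # extra inhibition from the antagonist's force
    "B": [-1, +1, -1, -1],
}


def motor_signals(connectivity, sensors):
    """Linear motor signal per muscle for one vector of sensor readings."""
    return {
        muscle: sum(w * s for w, s in zip(row, sensors))
        for muscle, row in connectivity.items()
    }


# A force increase measured in muscle A only (hypothetical unit reading).
force_spike_in_A = [0.0, 0.0, 1.0, 0.0]

human = motor_signals(HUMAN_LIKE, force_spike_in_A)
robot = motor_signals(ECCEROBOT_LIKE, force_spike_in_A)
```

With the human-like pattern only muscle A is inhibited, whereas with the ECCEROBOT-like pattern muscle B also receives a negative motor signal, which in the ECCEROBOT model drives its motor backwards and relaxes it.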
7 Conclusions
The main goal of this paper is to show that one can obtain reflex connectivity
(analogous to that observed in the human spinal cord) using a
self-organization process and a simple babbling strategy consisting of SMTs.
We have shown that our strategy works for three different reflexes: the
Myotatic reflex, the Reverse Myotatic reflex and the Reciprocal Inhibition
reflex. In addition, we have shown that the correct connectivity can only be
acquired when the influence of external perturbations during the learning
stage is limited. Finally, we have shown that different muscle models give
rise to different connectivity, which suggests that the reflex connectivity in
humans reflects the nature and geometry of the human musculoskeletal system.
References
[1] M. Asada, K. F. MacDorman, H. Ishiguro and Y. Kuniyoshi Cognitive
developmental robotics as a new paradigm for the design of humanoid
robots. Robotics and Autonomous Systems 37:185-93, 2001.
[2] M. Asada, K. Hosoda, Y. Kuniyoshi, H. Ishiguro, T. Inui, Y. Yoshikawa,
M. Ogino and C. Yoshida. Cognitive Developmental Robotics: A Survey.
In IEEE Transactions on Autonomous Mental Development 1(1):12-34,
2009.
[3] M. Lungarella, G. Metta, R. Pfeifer and G. Sandini. Developmental
robotics: a survey, Connection Science 15(4):151-190, 2003.
[4] H. F.R. Prechtl. Qualitative changes of spontaneous movements in fetus
and preterm infant are a marker of neurological dysfunction,
Early
Human Development 23:151-8, 1990.
[5] M. Hadders-Algra. Putative neural substrate of normal and abnormal
general movements. Journal of Pediatrics 145:S12-S16, 2004.
[6] A. B. Lchinger, M. Hadders-Algra, C. van Kan and J. de Vries. Fetal
Onset of General Movements. Pediatric Research 63:191-5, 2008.
[7] B. S. Kisilevsky and J. A. Low. Human Fetal Behaviour: 100 Years of
Study. Developmental review 18:1-29, 1998.
[8] C. M. van Kan, J. I.P de Vries, A. B. Lchinger, E. J.H. Mulder and
M. A.M. Taverne. Ontogeny of fetal movements in the guinea pig. Physiology and Behaviour 98:338-44, 2009.
[9] H. F.R Prechtl, C. Einspieler, G. Cioni, A. F. Bos, F. Ferrari and D. Sontheimer. An early marker of developing neurological handicap after perinatal brain lesions. Lancet 339: 1361-1363, 1997.
[10] S. E. Groen, A. C.E. de Blcourt, K. Postema, M. Hadders-Algra. Quality
of general movements predicts neuromotor development at the age of 912 years.
Developmental Medicine and Child Neurology 47: 731-8,
2005.
[11] M. Hadders-Algra. Putative neural substrate of normal and abnormal
general movements. Neuroscience and Biobehavioural Reviews 31:118190, 2007.
[12] O. Sporns and G. M. Edelman. Solving Bernsteins Problem: A Proposal
for the Development of Coordinated Movement by Selection.
Child
Development 64:960-81, 1993.
[13] W. P. Smotherman and S. R. Robinson. The Development of Behaviour
Before Birth. Developmental psychology 32(3):425-434, 1996.
[14] L. Berthouze and Y. Kuniyoshi. Emergence and Categorization of Coordinated Visual Behaviour Through Embodied Interaction. Machine
Learning 31:187-200, 1998.
[15] K. M. Newel and D. E. Vaillancourt. Dimensional change in motor learning. Human Movement Science 20:695-715, 2001.
[16] L. Berthouze and M. Lungarella. Motor Skill Acquisition Under Environmental Perturbations: On the necessity of Alternate Freezing and
Freeing Degrees of Freedom. Adaptive Behaviour 12(1):47-64, 2004.
[17] Y. Kuniyoshi and S. Sangawa. Early motor development from partially ordered neural-body dynamics: experiments with a cortico-spinalmusculo-skeletal model. Biological Cybernetics 95:589-605, 2006.
[18] H. Mori and Y. Kuniyoshi. A human fetus development simulation: Selforganization of behaviors through tactile sensation.
In IEEE 9th International Conference on Development and Learning p.82-7, 2010.
[19] P. Petersson, A. Waldenstrm, C. Fhraeus and J. Schouenborg. Spontaneous muscle twitches during sleep guide spinal self-organization, Nature
424, 72-75, 2003.
[20] S. Grillner. Muscle twitches during sleep shape the precise modules of
the withdrawal reflex. TRENDS in Neurosciences 27(4):169-71, 2004.
[21] M. Lungarella and O. Sporns. Mapping Information Flow in Sensorimotor Networks. PloS Computational Biology 2(10):1301-12, 2006.
[22] E. Bullmore and O. Sporns. Complex brain networks: graph theoretical
analysis of structural and functional systems. Nature Reviews Neuroscience 10:186-198, 2009.
[23] R. Pfeifer, M. Lungarella and F. Iida. Self-Organization, Embodiment,
and Biologically Inspired Robotics. Science 318:1088-93, 2007.
[24] A. A. Penn and C. J. Shatz. Brain waves and brain wiring: the role of
endogenous and sensory-driven neural activity in development. Pediatrics Research 45:447-58.
[25] P. Rochat, Self-perception and action in infancy,
Research, pp. 102109, 1998.
Experimental Brain
[26] P. Rochat and T. Striano, Perceived self in infancy,
Development, pp. 513530, 2000.
Infant Behavior
[27] N. Kudo and T. Yamada. Development of the monosynaptic stretch reflex
in the rat: an in vitro study. Journal of Physiology 369:127-44, 1985.
[28] B. Myklebust and G. Gottlieb. Development of the stretch reflex in the
newborn: Reciprocal excitation and reflex irradiation. Child Development 64(4):1036-45, 1993.
[29] A. Levinsson, M. Garwicz and J. Schouenborg. Sensorimotor transformation in cat nociceptive withdrawal reflex system. European Journal
of Neuroscience 11:4327-32, 1999.
[30] M. Bear, B. Connors and M. Paradiso. Neuroscience.
Lippincott-Williams and Wilkins(2nd ed.), 2001.
S. Katz (ed.),
[31] H-H. Chen, S. Hippenmeyer, S. Arber and E. Frank. Development of the
monosynaptic stretch reflex circuit. Current Opinion of Neurobiology
13:96:102, 2003.
[32] S. Wittmeier, M. Jäntsch, K. Dalamagkidis and A. Knoll. Physics-based
Modeling of an Anthropomimetic Robot IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2011.
[33] H. Geyer and H. Herr. A Muscle-Reflex Model That Encodes Principles
of Legged Mechanics Produces Human Walking Dynamics and Muscle
Activities.
IEEE TRANSACTIONS ON NEURAL SYSTEMS AND
REHABILITATION ENGINEERING, VOL. 18, NO. 3, 2010.
[34] E. Coumans. Bullet Physics Library. Sony Computer Entertainment.
[Online]. Available: http://www.bulletphysics.com
[35] Embodied Cognition In A Compliantly Engineered Robot (ECCEROBOT). [Online]. Available: http://www.eccerobot.eu
[36] M. Berniker. Linearity, Motor Primitives and Low-Dimensionality in
the Spinal Organization of Motor Control. Unpublished doctoral dissertation, MIT USA, 1971.
[37] F. E. Zajac. Muscle and tendon: properties, models, scaling and application to biomechanics and motor control. Critical Reviews in Biomedical
Engineering, vol. 17, no. 4, pp. 359410, 1989.
[38] B. Kosko. Differential hebbian learning,
vol. 151, no. 1, pp. 27782, 1986.
AIP Conference Proceedings,
[39] A. V. Hill The heat of shortening and dynamics constants of muscles.
Proc. R. Soc. Lond. B (London: Royal Society) 126 (843): 136195, 1938.
[40] Open Graphics Library.[Online]. Available: http://www.opengl.org
[41] O. Holland and R. Knight. The anthropomimetic principle. J. Burn and
M. Wilson (eds.) Proceedings of the AISB06 Symposium on Biologically
Inspired Robotics, 2006.
[42] J. L. Elman. Learning and development in neural networks.
48:71-99, 1993.
[43] D. Hebb. The organization of behavior.
1949.
Cognition
Wiley and Sons, New York,
[44] S. F. Giszter and W. J. Kargo, Modeling of dynamic controls in the frog
wiping reflex: Force-field level controls,
Neurocomputing, pp.12391247, 2001.
Bio-Inspired Robotics Lab
Prof. Dr. F. Iida
Title of work: Emergence of Reflexive Behavior from Single Muscle Twitches
Thesis type and date: Master Thesis, August 2011
Supervision: Prof. Dr. F. Iida, Dr. H. Gravato Marques
Student:
Name: Farhan Imtiaz
E-mail: imtiazf@student.ethz.ch
Legi-Nr.: 06-936-246
Semester: Final