NeuralEssay

CS 6482
Neural Networks in Cognitive Science
Critical Evaluation of
"A Cerebellar Neural Network Implementation of a Temporally Adaptive Conditioned Response"
by Moore, J. W. & Desmond E. J. (1992)
Submitted by
Anastasios Palaiologou
MSc in Cognitive Science
Having a background in psychology, and having spent a year assisting at a behavioral
neuroscience laboratory, I was inclined to examine a related phenomenon. My initial goal
was to propose a model of my own. The phenomenon I had chosen was the Dual-Process theory
of Thompson and Groves (1973). The theory attempts to clarify the temporal and behavioral
patterns of, and the relationship between, two forms of non-associative learning, namely
habituation and sensitization. These phenomena have been studied extensively, both within
and outside the framework of this theory, in the behavioral systems of a surprising variety
of organisms, ranging from invertebrates such as the sea slug (Aplysia californica), through
non-human vertebrates (several species of fish come to mind), to humans. However, I soon
found myself facing a considerable problem. Research on these phenomena has been conducted
at several levels of scientific enquiry, yielding behavioral, neurological, cellular and
molecular explanations. Although the concepts and the research were mostly familiar to me,
I had underestimated the complexity of proposing a neural network model for these
phenomena, and for the Dual-Process theory in particular.
The reason for this rather lengthy introduction is to explain why I finally decided on the
less arduous and less creative task of critically evaluating an existing model. Nonetheless,
I kept to my original motivation, and the connectionist model to be examined simulates a
learning phenomenon. The model in question was developed by Moore and Desmond (1992) in the
paper "A Cerebellar Neural Network Implementation of a Temporally Adaptive Conditioned
Response". The model aims to simulate the development and generation of a conditioned
response (CR) with temporally adaptive properties. The behavioral system modeled is the
conditioned nictitating membrane response (NMR) in the rabbit. Moore and Desmond (1992)
subsequently adapt their model to neurological data suggesting that the cerebellum and the
brain stem are the neurobiological substrates of the behavior in question and of timing
behavior in general. Given the extensive nature of their model, I will focus on the part
that simulates the behavioral properties of the CR topography and leave out the
neurobiological implications.
The first section of the essay will describe the phenomenon being modeled.
Subsequently, the model itself and its basic assumptions will be examined. Finally,
experimental tests and simulations run with the model will be considered with regard
to experimental data on the phenomenon in question, as well as the model's ability to
generalize and predict other behavioral phenomena.
The phenomenon of a temporally adaptive CR is traditionally referred to in Pavlovian
classical conditioning as inhibition of delay. In particular, inhibition of delay occurs
when the topography (onset latency) and the nature (form) of the response adapt to the
temporal features of the presentation of the experimental stimuli, the unconditioned
stimulus (UCS) and the conditioned stimulus (CS).
It is essential to elaborate on some of the properties of this phenomenon before
proceeding to describe the model. The most prominent feature is that the topography of the
CR changes progressively with increasing acquisition trials. In the initial phases of
training the CR occurs at the beginning of the CS-UCS interval, preceding the UCS. At that
point the CR is manifested as part of the orienting reflex or the unconditioned response
(UCR). As training progresses, the CR becomes progressively differentiated from the UCR as
it moves towards the onset of the CS. After that, the temporal course of the event is more
or less the same, with minor variations depending on the conditioning paradigm employed
(delay vs. trace conditioning). The CR topography changes as the CR moves towards the onset
of the UCS. This is demonstrated by the peak of the CR occurring just prior to the UCS. The
time course of the phenomenon depends on the experimental parameters employed, such as the
interstimulus and intertrial intervals (ISI and ITI) and the number of acquisition trials.
I studied this phenomenon in my final-year undergraduate project (Palaiologou, 1999), using
the Branchial Defense Reflex (BDR) of the goldfish (Carassius auratus). My results showed
that the phenomenon was more evident with longer training periods (more acquisition trials)
and longer ISIs, in both delay and trace conditioning paradigms.
From the above description it can be deduced that the organism is learning an efficient,
energy-saving adaptive response. The organism learns to produce the response when it is
needed, that is, when the noxious or hazardous stimulus is about to be presented. This form
of adaptive learning is not evident in experimental paradigms involving invertebrates, and
is believed to be characteristic of organisms higher on the evolutionary ladder. Apart from
the NMR in the rabbit, this phenomenon has also been established in the goldfish (as
mentioned above), as well as in human eye-blink conditioning.
The model developed by Moore and Desmond (1992) is termed VET and is composed of two nodes
that use Hebbian learning rules. The first node (V) is responsible for providing the network
output and for computing association values between the stimuli (CS and UCS). The second
node (E) is responsible for the temporal computations, in that it computes the expected
arrival time of the UCS. Additionally, it is connected to the V node in order to reinforce
the values computed by that node. The input to the network arrives via tapped delay lines
(Moore and Desmond, 1992) so that temporal computations can take place. The model is
centered on one basic assumption: processes that perform temporal computations are triggered
by changes in the activation of the CS (onset and offset). Since the input is provided by
tapped delay lines, the onset of the CS activates a different delay line from its offset.
Based on this assumption, each input is represented by $x_{ijk}$, where $i = 1, \dots, n$
denotes the activating CS, $j$ denotes the activation state of the CS ($j = 1$ for onset and
$j = 0$ for offset), and $k = 1, \dots, N$ denotes the position in the delay line, $N$ being
the total number of positions. The inputs therefore form an $n \times 2 \times N$ array. At
a given time $t$, an input $x_{ijk}$ is either on ($x_{ijk}(t) = 1$) or off
($x_{ijk}(t) = 0$), and remains so for a predefined number of iterations (time steps). All
the input elements connect to both nodes, V and E. In addition, the two processing units
receive input from the UCS, $L(t)$, while V also receives regulating input from E, $r(t)$,
as already mentioned. The output $s(t)$ of the network, provided by the V node, is given by
the following equation:
$$s(t) = \sum_i \sum_j \sum_k V_{ijk}(t)\, x_{ijk}(t) + L(t).$$
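The input encoding and the output equation above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the rule that the signal advances one tap per time step, and the `hold` parameter are assumptions made here for illustration, and the tap index is 0-based in the code.

```python
import numpy as np

def tapped_delay_inputs(n_steps, n_cs, n_taps, onsets, offsets, hold=1):
    """Return x[t, i, j, k]: binary CS inputs over time.

    Axes: t = time step, i = CS index, j = 1 (onset line) or 0 (offset line),
    k = tap position. Assumed here (not stated in the essay): the signal
    advances one tap per time step and each tap stays on for `hold` steps.
    """
    x = np.zeros((n_steps, n_cs, 2, n_taps))
    for i in range(n_cs):
        for j, start in ((1, onsets[i]), (0, offsets[i])):
            for k in range(n_taps):
                t0 = start + k
                # Slices that run past the end of the array are clipped.
                x[t0:t0 + hold, i, j, k] = 1.0
    return x

def network_output(V, x_t, L_t):
    """s(t) = sum over i, j, k of V_ijk(t) * x_ijk(t), plus the UCS input L(t)."""
    return float(np.sum(V * x_t) + L_t)

# One CS with onset at t = 2 and offset at t = 5, three taps per delay line.
x = tapped_delay_inputs(n_steps=10, n_cs=1, n_taps=3, onsets=[2], offsets=[5])
V = np.full((1, 2, 3), 0.5)           # hypothetical weights
s = network_output(V, x[3], L_t=1.0)  # output at t = 3 with the UCS present
```

In this toy run only the second onset tap is active at t = 3, so the output is that tap's weight plus the UCS input.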
The network output is thus the weighted sum of the CS inputs plus the UCS input. The change
in a weight $V_{ijk}$ is driven by the difference between the UCS input and the network
output, and is given by the equation:
$$\Delta V_{ijk}(t) = c\,[L(t) - \hat{s}(t)]\, h_{ijk}(t)\, \bar{x}_{ij}(t)\, r(t),$$
where $c$ is a learning rate parameter, $h_{ijk}(t)$ and $\bar{x}_{ij}(t)$ represent control
conditions that need to be met for the weights to change, and $r(t)$ is the output of node E,
given by the equation:
$$r(t) = \max\{E_{ijk}(t)\, \Delta x_{ijk}(t) \mid i = 1, \dots, n;\ j = 0, 1;\ k = 1, \dots, N\},$$
where $\Delta x_{ijk}(t) = 1$ if $x_{ijk}(t) - x_{ijk}(t-1) = 1$, and $\Delta x_{ijk}(t) = 0$
otherwise. The changes in the input weights $E_{ijk}$ of the E node are given by the equation:
$$\Delta E_{ijk}(t) = c\,[L(t) - r(t)]\, \Delta x_{ijk}(t)\, \bar{x}_{ij}(t).$$
The term $\Delta x_{ijk}(t)$ in this equation is equivalent to the term $h_{ijk}(t)$ in the
equation defining the weight change for node V. As defined by Moore and Desmond (1992),
these terms control the time interval during which the weights of the two nodes can change.
So, whilst the weights of the V synapses can change over a longer predefined period,
synaptic change in the E node occurs for just one iteration, namely only when the UCS is
presented.
According to Moore and Desmond (1992), the net effect of the equations concerning node E is
to activate node V when the UCS is imminently expected.
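As a rough illustration of the update rules above, the following Python sketch performs a single time step for both nodes. It is a sketch under stated assumptions rather than the authors' code: the essay does not define the term written with a hat in the V rule, so it is assumed here to be the CS-driven part of the output (the weighted sum without the UCS input), and the two control terms are simply passed in as arrays by the caller. All function names are hypothetical.

```python
import numpy as np

def delta_x(x_t, x_prev):
    """Return 1.0 where an input switched from off to on at this step, else 0.0."""
    return ((x_t - x_prev) == 1).astype(float)

def e_output(E, dx):
    """r(t): the largest E weight among the inputs that just switched on."""
    return float(np.max(E * dx))

def vet_step(V, E, x_t, x_prev, L_t, h, xbar, c=0.1):
    """One update of the V and E weights.

    V, E, x_t, x_prev and h have shape (n, 2, N); xbar has shape (n, 2).
    ASSUMPTION: s_hat is the weighted sum of CS inputs without L(t).
    ASSUMPTION: h and xbar are gating factors supplied by the caller.
    """
    dx = delta_x(x_t, x_prev)
    r = e_output(E, dx)
    s_hat = float(np.sum(V * x_t))
    gate = xbar[..., None]              # broadcast over the tap axis
    dV = c * (L_t - s_hat) * h * gate * r
    dE = c * (L_t - r) * dx * gate
    return V + dV, E + dE, r

# Toy step: one CS, three taps, the second onset tap has just switched on.
V0 = np.zeros((1, 2, 3))
E0 = np.full((1, 2, 3), 0.5)
x_prev = np.zeros((1, 2, 3))
x_now = np.zeros((1, 2, 3))
x_now[0, 1, 1] = 1.0
V1, E1, r = vet_step(V0, E0, x_now, x_prev, L_t=1.0,
                     h=np.ones((1, 2, 3)), xbar=np.ones((1, 2)))
```

Passing the control terms in as arrays keeps the sketch agnostic about how Moore and Desmond compute them, which the essay does not specify.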
The first simulation on which the model was tested was a relatively straightforward delay
conditioning experiment, involving acquisition training and subsequent extinction of the
response. The results clearly demonstrate the behavior described earlier in the essay, and
are in agreement with the traditional psychological literature on the issue (Hilgard, 1956;
Gormezano & Moore, 1969; Prokasy, 1987).
The next simulation provides valuable insights into a question that has been heavily
researched and debated in classical conditioning, namely the nature of the CR. More
specifically, during a conditioning experiment several types of responses are manifested,
and the experimenter is faced with the task of distinguishing among them. While responses
were originally distinguished on the grounds of the stimulus by which they were elicited (a
UCS naturally elicits a UCR, while the CS after training elicits a CR), the distinction has
proven not to be that clear-cut after all. Attempts to distinguish between responses in
terms of onset latency and shape of the response have also fallen short, as experimental
paradigms became more complicated and phenomena like inhibition of delay were studied more
extensively. Konorski refined the different-latencies assumption by distinguishing between
two types of CRs. The first class consists of 'preparatory' CRs, which reflect the
organism's general motivational state (Fantino & Logan, 1979). 'Consummatory' CRs, on the
other hand, are specific responses related to a certain type of behavior, such as avoidance
behavior (ibid.). When both types of response are exhibited in a situation, preparatory
responses need fewer acquisition trials to become established, and they are also
characterized by shorter latencies than consummatory responses. The model in question
appears to be sensitive to these kinds of differences between responses. When it is run
with different ISIs, the development of CRs accords with the traditional psychological
findings of responses with different latencies and overall characteristics. With short ISIs
an early-onset response occurs. When a longer ISI is used, this short-latency response is
accompanied by a longer-latency one (Moore and Desmond, 1992).
The authors also provide a series of simulations addressing the phenomenon of inhibition.
Blocking experiments and compound-stimulus experiments are replicated, and the results are
again comparable to the original psychological research. This is a rather interesting
feature of the model, since modeling these phenomena was not among the initial intentions
of the authors. Nevertheless, the model's emergent behavior accounts for these types of
inhibition phenomena.
Altogether, Moore and Desmond provide an adequate model of the phenomenon in question. In
particular, the model adequately replicates the experimental evidence on inhibition of
delay. Changes in response topography follow the temporal course reported in the
literature. The major drawback, however, is that although the temporal aspects of the
phenomenon are adequately modeled, the changes in the nature and form of the response are
not. All the responses produced are of the same amplitude and shape, although it is well
established in the literature that the form of the CR is of great importance. Nonetheless,
the authors state clearly that the purpose of the neural network is to model the timing of a
temporally adaptive response rather than any other feature of it. Finally, an aspect of
particular interest in this model is that the network behaves realistically in situations
and conditions that it was not designed to account for.
BIBLIOGRAPHY
Fantino, E., & Logan, C. A. (1979). The Experimental Analysis of Behavior: A
Biological Perspective. San Francisco: W. H. Freeman and Company.
Gormezano, I., & Moore, J. W. (1969). Classical Conditioning. In M. H. Marx (Ed.),
Learning: Processes. London: Collier-Macmillan Limited.
Groves, P., & Thompson, R. F. (1973). A Dual-Process Theory of Habituation. In H.
Peeke & M. Hertz (Eds.), Habituation II: Neural Substrates. New York: Academic
Press.
Hilgard, E. R. (1956). Theories of Learning. New York: Appleton-Century-Crofts.
Millenson, J. R., Kehoe, E. J., & Gormezano, I. (1977). Classical Conditioning of the
Rabbit's Nictitating Membrane Response Under Fixed and Mixed CS-US Intervals.
Learning and Motivation, 8, 351-366.
Moore, J. W., & Desmond, E. J. (1992). A Cerebellar Neural Network Implementation
of a Temporally Adaptive Conditioned Response. In I. Gormezano & E. A. Wasserman
(Eds.), Learning and Memory: The Behavioral and Biological Substrates.
Hillsdale, New Jersey: Lawrence Erlbaum Associates.
Palaiologou, A. (1999). Inhibition of Delay as a Function of Acquisition Trials and
Conditioning Paradigm of the Branchial Defense Reflex (BDR) in the Goldfish
(Carassius auratus). Final Year Undergraduate Project for the BA in Psychology.
Prokasy, W. F. (1987). A Perspective on the Acquisition of Skeletal Responses
Employing the Pavlovian Paradigm. In I. Gormezano, W. F. Prokasy, & R. F.
Thompson (Eds.), Classical Conditioning III (pp. 287-318). Hillsdale, New Jersey:
Lawrence Erlbaum Associates.
Rescorla, R. A., & Wagner, A. R. (1972). A Theory of Pavlovian Conditioning:
Variations in the Effectiveness of Reinforcement and Nonreinforcement. In A. H. Black
& W. F. Prokasy (Eds.), Classical Conditioning II: Current Theory and Research.
New York: Appleton-Century-Crofts.
Thompson, R. F. (1986). The Neurobiology of Learning and Memory. Science, 233,
941-947.
Thompson, R. F., & Spencer, W. A. (1966). Habituation: A Model Phenomenon for
the Neural Substrates of Behavior. Psychological Review, 73, 16-43.
Thompson, R. F., Groves, P., Teyler, T., & Roemer, R. (1973). A Dual-Process
Theory of Habituation. In H. Peeke & M. Hertz (Eds.), Habituation I: Behavioral
Studies. New York: Academic Press.