CS 6482 Neural Networks in Cognitive Science

Critical Evaluation of "A Cerebellar Neural Network Implementation of a Temporally Adaptive Conditioned Response" by Moore, J. W. & Desmond E. J. (1992)

Submitted by Anastasios Palaiologou
MSc in Cognitive Science

Having a background in psychology, and in particular having spent a year assisting at a behavioral neuroscience laboratory, I was inclined to examine a related phenomenon. My initial goal was to propose a model of my own. The phenomenon I had chosen was the Dual-Process theory of Thompson and Groves (1973). The theory attempts to clarify the temporal and behavioral patterns of, and the relationship between, two forms of non-associative learning, namely habituation and sensitization. These phenomena have been extensively studied, both within and outside the context of this theory, in the behavioral systems of a surprising variety of organisms, ranging from invertebrates such as the sea slug (Aplysia californica), through nonhuman vertebrates (several species of fish come to mind), to humans.

However, I soon found myself facing a considerable problem. Research on these phenomena has been conducted at several levels of scientific enquiry: behavioral, neurological, cellular and molecular. Although the concepts and the research were mostly familiar to me, I had underestimated the complexity of proposing a neural network model for these phenomena, and for the dual-process theory in particular. This rather lengthy introduction serves to explain why I finally decided on the less arduous, and less creative, task of critically evaluating an existing model. Nonetheless, I stayed true to my original motivation: the connectionist model to be examined simulates a learning phenomenon.
The model in question was developed by Moore and Desmond (1992) in the paper "A Cerebellar Neural Network Implementation of a Temporally Adaptive Conditioned Response". The purpose of the model is to simulate the development and generation of a conditioned response (CR) with temporally adaptive properties. The behavioral system modeled is the conditioned nictitating membrane response (NMR) in the rabbit. Moore and Desmond (1992) subsequently adapt their model to neurological data suggesting that the cerebellum and the brain stem are the neurobiological substrates of the behavior in question, and of timing behavior in general. Given the extensive scope of their model, I will focus on the part that simulates the behavioral properties of the CR topography and leave out the neurobiological implications.

The first section of the essay describes the phenomenon being modeled. Subsequently, the model itself and its basic assumptions are examined. Finally, the experimental tests and simulations run with the model are considered against experimental data on the phenomenon in question, together with the model's ability to generalize and predict other behavioral phenomena.

The phenomenon of a temporally adaptive CR is traditionally referred to in Pavlovian classical conditioning as inhibition of delay. In particular, inhibition of delay occurs when the topography (onset latency) and the nature (form) of the response adapt to the temporal features of the presentation of the experimental stimuli, the unconditioned stimulus (UCS) and the conditioned stimulus (CS). It is essential to elaborate on some of the properties of this phenomenon before proceeding to describe the model. The most prominent feature is that the topography of the CR changes progressively with increasing acquisition trials. In the initial phases of training the CR occurs at the beginning of the CS-UCS interval, immediately following the onset of the CS.
At that point the CR is manifested as a part of the orienting reflex or the unconditioned response (UCR). As training progresses, the CR becomes progressively differentiated from the UCR as it moves towards the onset of the CS. After that, the temporal course of the event is broadly the same, with minor variations depending on the conditioning paradigm employed (delay vs. trace conditioning). The CR topography changes as the CR moves towards the onset of the UCS. This is demonstrated by the peak CR occurring just prior to the UCS. The time course of the phenomenon depends on the experimental parameters employed, such as the interstimulus and intertrial intervals (ISI and ITI) and the number of acquisition trials. I studied this phenomenon in my final-year undergraduate project (Palaiologou, 1999), using the Branchial Defense Reflex (BDR) of the goldfish (Carassius auratus); my results showed that the phenomenon was more evident with longer training periods (more acquisition trials) and longer ISIs, in both delay and trace conditioning paradigms.

From the above description it can be deduced that the organism is learning an energy-saving, efficient, adaptive response. The organism learns to produce the response when it is needed, that is, when the noxious or hazardous stimulus is about to be presented. This form of adaptive learning is not evident in experimental paradigms involving invertebrates, and is believed to be characteristic of organisms higher on the evolutionary ladder. Apart from the NMR in the rabbit, the phenomenon has also been established in the goldfish (as mentioned above) and in human eye-blink conditioning.

The model developed by Moore and Desmond (1992) is termed VET and is composed of two nodes that use Hebbian learning rules. The first unit (V) is responsible for providing the network output and computing association values between the stimuli (CS and UCS).
The second unit (E) is responsible for the temporal computations: it computes the expected arrival time of the UCS. It is also connected to the V node, in order to reinforce the values computed by that unit. The input to the network arrives via tapped delay lines (Moore and Desmond, 1992) so that temporal computations can take place.

The model is centered on one basic assumption: the processes that perform temporal computations are triggered by changes in the activation of the CS (onset and offset). Since the input is provided by the tapped delay lines, the onset and the offset of the CS each activate a different delay line. Based on this assumption, each input is represented by $x_{ijk}$, where $i = 1, \ldots, n$ denotes the activating CS; $j$ denotes the activation state of the CS, with $j = 1$ for onset and $j = 0$ for offset; and $k = 1, \ldots, N$ denotes the position in the delay line, $N$ being the total number of positions. The inputs therefore form an $n \times 2 \times N$ array. At a given time $t$, an input $x_{ijk}$ is either on ($x_{ijk}(t) = 1$) or off ($x_{ijk}(t) = 0$), and remains so for a predefined number of iterations (time steps). All the input elements connect to both nodes, V and E. In addition, both processing units receive input from the UCS, $L(t)$, while V also receives the regulating input $r(t)$ from E, as already mentioned. The output $s(t)$ of the network, provided by the V node, is given by the following equation:

$$s(t) = \sum_{i} \sum_{j} \sum_{k} V_{ijk}(t)\, x_{ijk}(t) + L(t)$$

As can be seen, the network output is the weighted sum of the CS inputs plus the UCS input.
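As an illustration only, the input encoding and the output equation described above might be sketched in Python as follows. The function names (`make_inputs`, `activate_tap`, `network_output`) are hypothetical and do not come from Moore and Desmond (1992). The sketch also assumes zero-based tap indices and switches on a single tap per call, whereas in the model an input remains on for a predefined number of iterations.

```python
def make_inputs(n_cs, n_taps):
    """Return an n x 2 x N nested list of zeros, indexed as x[i][j][k]."""
    return [[[0.0 for _ in range(n_taps)] for _ in range(2)]
            for _ in range(n_cs)]


def activate_tap(x, i, j, k):
    """Switch on tap k of the delay line for CS i.

    j = 1 selects the onset line and j = 0 the offset line, as in the text.
    Out-of-range taps are ignored (an assumption of this sketch).
    """
    if 0 <= k < len(x[i][j]):
        x[i][j][k] = 1.0
    return x


def network_output(V, x, L):
    """s(t) = sum over i, j, k of V_ijk(t) * x_ijk(t), plus L(t)."""
    total = 0.0
    for V_i, x_i in zip(V, x):
        for V_ij, x_ij in zip(V_i, x_i):
            for v, x_val in zip(V_ij, x_ij):
                total += v * x_val
    return total + L


# One CS with five taps; tap 3 of the onset line is currently active.
x = activate_tap(make_inputs(1, 5), i=0, j=1, k=3)
V = make_inputs(1, 5)   # weights share the shape of the inputs
V[0][1][3] = 0.4        # hypothetical weight value, for illustration only

s_without_ucs = network_output(V, x, L=0.0)
s_with_ucs = network_output(V, x, L=1.0)
```

With the UCS absent the output reduces to the single active weight; with the UCS present, `L` is simply added on top, which is what the equation states.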
The weight change for a connection $V_{ijk}$ is driven by the difference between the UCS input and the output term $\hat{s}(t)$, and is given by the equation:

$$\Delta V_{ijk}(t) = c\,[L(t) - \hat{s}(t)]\, h_{ijk}(t)\, \bar{x}_{ij}(t)\, r(t)$$

where $c$ is a learning rate parameter, $h_{ijk}(t)$ and $\bar{x}_{ij}(t)$ represent control conditions that need to be met for the weights to change, and $r(t)$ is the output of node E, given by the equation:

$$r(t) = \max\{E_{ijk}(t)\, \Delta x_{ijk}(t) \mid i = 1, \ldots, n;\ j = 0, 1;\ k = 1, \ldots, N\}$$

where $\Delta x_{ijk}(t) = 1$ if $x_{ijk}(t) - x_{ijk}(t-1) = 1$, and $\Delta x_{ijk}(t) = 0$ otherwise. The change in the input weights $E_{ijk}$ of the E node is given by the equation:

$$\Delta E_{ijk}(t) = c\,[L(t) - r(t)]\, \Delta x_{ijk}(t)\, \bar{x}_{ij}(t)$$

The term $\Delta x_{ijk}(t)$ in this equation is equivalent to the term $h_{ijk}(t)$ in the equation defining the weight change for node V. As defined by Moore and Desmond (1992), the purpose of this term is to control the time interval during which the weights of the two nodes can change. So, whilst the weights of the V synapses can change over a longer predefined period, synaptic change in the E node occurs for just one iteration, namely only when the UCS is presented. According to Moore and Desmond (1992), the effect of the equations for node E is to activate node V when the UCS is imminently expected.

The first simulation on which the model was tested was a relatively straightforward delay conditioning experiment, involving acquisition training and subsequent extinction of the response. The results clearly demonstrate the behavior described earlier in the essay, and agree with the traditional psychological literature on the issue (Hilgard, 1956; Gormezano & Moore, 1969; Prokasy, 1987).

The next simulation provides valuable insights into a question much researched and debated in classical conditioning research: the nature of the CR. More specifically, during a conditioning experiment several types of responses are manifested, and the experimenter is faced with the task of distinguishing among them.
While responses were originally distinguished on the grounds of the stimulus that elicited them (a UCS naturally elicits a UCR, while the CS elicits a CR after training), the distinction has proven not to be that clear-cut after all. Attempts to distinguish between responses in terms of onset latency and response shape have also fallen short as experimental paradigms became more complicated and phenomena like inhibition of delay were studied more extensively. Konorski refined the different-latencies assumption by distinguishing between two types of CRs. The first class is 'preparatory' CRs, which reflect the organism's general motivational state (Fantino & Logan, 1979). 'Consummatory' CRs, on the other hand, are specific responses related to a certain type of behavior, such as avoidance behavior (ibid.). When both kinds of response are exhibited in a situation, preparatory responses need fewer acquisition trials to become established and are also characterized by shorter latencies than consummatory responses. The model in question appears to be sensitive to this kind of difference between responses. When it is run with different ISIs, the development of CRs accords with the traditional psychological findings of responses with different latencies and overall characteristics. With short ISIs an early-onset response occurs. When a longer ISI is used, this short-latency response is accompanied by a longer-latency one (Moore and Desmond, 1992).

The authors also provide a series of simulations in which the phenomenon of inhibition is addressed. Blocking experiments and compound-stimulus experiments are replicated, and the results are again comparable to the original psychological research. This is a rather interesting feature of the model, since modeling these phenomena was not among the initial intentions of the authors. However, the model's emergent behavior accounts for these types of inhibition phenomena.
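To make the learning rules concrete, the update equations for nodes V and E given earlier can be illustrated with a minimal Python sketch of a single time step. Several things here are assumptions rather than details taken from Moore and Desmond (1992): the function names (`zeros`, `e_output`, `update_step`) are hypothetical, the control terms h and x-bar are treated as 0/1 gates supplied by the caller, and s-hat is read as the CS-driven weighted sum without L(t).

```python
def zeros(n_cs, n_taps):
    """n x 2 x N nested list of zeros, indexed as a[i][j][k]."""
    return [[[0.0] * n_taps for _ in range(2)] for _ in range(n_cs)]


def e_output(E, dx):
    """r(t) = max over i, j, k of E_ijk(t) * dx_ijk(t)."""
    return max(
        e * d
        for E_i, dx_i in zip(E, dx)
        for E_ij, dx_ij in zip(E_i, dx_i)
        for e, d in zip(E_ij, dx_ij)
    )


def update_step(V, E, x, dx, h, xbar, L, c):
    """Apply one time step of both update equations in place.

    Assumption: s_hat is the weighted sum of the CS inputs without L.
    h[i][j][k] and xbar[i][j] are gating values supplied by the caller.
    Returns (s_hat, r) as they were computed before the weights changed.
    """
    s_hat = sum(
        V[i][j][k] * x[i][j][k]
        for i in range(len(V))
        for j in range(len(V[i]))
        for k in range(len(V[i][j]))
    )
    r = e_output(E, dx)
    for i in range(len(V)):
        for j in range(len(V[i])):
            for k in range(len(V[i][j])):
                # Delta V_ijk = c [L - s_hat] h_ijk xbar_ij r
                V[i][j][k] += c * (L - s_hat) * h[i][j][k] * xbar[i][j] * r
                # Delta E_ijk = c [L - r] dx_ijk xbar_ij
                E[i][j][k] += c * (L - r) * dx[i][j][k] * xbar[i][j]
    return s_hat, r


# Toy example: one CS, three taps; the last onset tap switches on
# exactly when the UCS arrives (L = 1). All values are illustrative.
V, E = zeros(1, 3), zeros(1, 3)
x, dx, h = zeros(1, 3), zeros(1, 3), zeros(1, 3)
x[0][1][2] = dx[0][1][2] = h[0][1][2] = 1.0
xbar = [[0.0, 1.0]]  # only the onset line is eligible

first = update_step(V, E, x, dx, h, xbar, L=1.0, c=0.5)
second = update_step(V, E, x, dx, h, xbar, L=1.0, c=0.5)
```

In this toy run the V weight stays at zero on the first step because r(t) is still zero; only after E has learned to signal the expected UCS time does V begin to change. This matches the description of node E as the unit that activates node V when the UCS is imminently expected.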
Altogether, Moore and Desmond provide a rather adequate model of the phenomenon in question. In particular, the model adequately replicated the experimental evidence on inhibition of delay: changes in response topography followed the temporal course reported in the literature. The major drawback, however, was that although the timing aspects of the phenomenon were adequately modeled, the changes in the nature and form of the response were not. All the responses produced were of the same amplitude and shape, although it is well established in the literature that the form of the CR is of great importance. Nonetheless, the authors clearly stated that the purpose of the neural network was to model the timing of a temporally adaptive response rather than any other feature of it. Finally, an interesting aspect of this specific model was that the network behaved realistically in situations and conditions that it was not designed to account for.

BIBLIOGRAPHY

Fantino, E., & Logan, C. A. (1979). The Experimental Analysis of Behavior: A Biological Perspective. San Francisco: W. H. Freeman and Company.

Gormezano, I., & Moore, J. W. (1969). Classical Conditioning. In M. H. Marx (Ed.), Learning: Processes. London: Collier-Macmillan Limited.

Groves, P., & Thompson, R. F. (1973). A Dual-Process Theory of Habituation. In H. Peeke & M. Hertz (Eds.), Habituation II: Neural Substrates. New York: Academic Press.

Hilgard, E. R. (1956). Theories of Learning. New York: Appleton-Century-Crofts.

Millenson, J. R., Kehoe, E. J., & Gormezano, I. (1977). Classical Conditioning of the Rabbit's Nictitating Membrane Response Under Fixed and Mixed CS-US Intervals. Learning and Motivation, 8, 351-366.

Moore, J. W., & Desmond, E. J. (1992). A Cerebellar Neural Network Implementation of a Temporally Adaptive Conditioned Response. In I. Gormezano & E. A. Wasserman (Eds.), Learning and Memory: The Behavioral and Biological Substrates.
Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Palaiologou, A. (1999). Inhibition of Delay as a Function of Acquisition Trials and Conditioning Paradigm of the Branchial Defense Reflex (BDR) in the Goldfish (Carassius auratus). Final Year Undergraduate Project for the BA in Psychology.

Prokasy, W. F. (1987). A Perspective on the Acquisition of Skeletal Responses Employing the Pavlovian Paradigm. In I. Gormezano, W. F. Prokasy, & R. F. Thompson (Eds.), Classical Conditioning III (pp. 287-318). Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Rescorla, R. A., & Wagner, A. R. (1972). A Theory of Pavlovian Conditioning: Variations in the Effectiveness of Reinforcement and Nonreinforcement. In A. Black & W. F. Prokasy (Eds.), Classical Conditioning II: Current Theory and Research. New York: Appleton-Century-Crofts.

Thompson, R. F. (1986). The Neurobiology of Learning and Memory. Science, 233, 941-947.

Thompson, R. F., & Spencer, W. A. (1966). Habituation: A Model Phenomenon for the Neural Substrates of Behavior. Psychological Review, 73, 16-43.

Thompson, R. F., Groves, P., Teyler, T., & Roemer, R. (1973). A Dual-Process Theory of Habituation. In H. Peeke & M. Hertz (Eds.), Habituation I: Behavioral Studies. New York: Academic Press.