
Handout to accompany Rumelhart and McClelland: On learning the past tense of English verbs.

1. Structure of the model

As you can see from Figure 1, the model is a feedforward neural network consisting of an input and an output layer. Note that the connections between the “phonological representation of root form” and the “wickelfeature representation of root form” are hardwired, as are the connections between the “wickelfeature representation of past tense” and the “phonological representation of past tense”. Only the connections between the “wickelfeatures of root form” and the “wickelfeatures of past tense” are subject to change (that is, to learning). Thus we will consider only these last connections as part of “the network”.

2. Operation of the model

The inputs to the network are binary (0 or 1), and the outputs are “probabilistic”. This means that at each sampling step, the probability that a neuron is active (aj=1) rather than inactive (aj=0) is a function of its summed input (netj). The figure below shows the logistic function used to calculate the probabilities with different values of the “temperature” parameter T.

[Figure: logistic function plotted for T=2, T=1, T=0.5 and T=0.2; horizontal axis from -5 to 5.]
Logistic function with different values of T.

When T becomes small, the logistic function approaches a step function (threshold function or Heaviside function), and the output behavior of the units becomes more and more deterministic.

3. Learning

As described during lecture, each example of the training set (a pair of root and past tense wickelfeature representations) is presented to the network. The corresponding output is calculated as a function of the input, the connection weights and the logistic function. The produced output is then compared to the desired output (i.e. the “correct” past tense for a given root form), and the connection weights are adjusted using the perceptron convergence rule (see Lecture 3). One feature of such a network is that it can extract regularities from a set of input patterns.
In a way, the network performs statistics: it finds what the patterns have in common and what sets them apart.
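
The operation and learning steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the original simulation. It assumes that the probability of a unit being active is 1 / (1 + exp(-(netj - thetaj) / T)), where thetaj is a per-unit threshold, and that the perceptron convergence rule changes a unit's weights only when its sampled output differs from the target. The function names, the learning rate eta and the toy patterns are hypothetical and chosen only for this example.

```python
import math
import random


def activation_probability(net, theta=0.0, T=1.0):
    """Probability that a unit is active, given its summed input `net`,
    an assumed threshold `theta`, and the temperature `T`."""
    x = (net - theta) / T
    if x < -700:  # avoid overflow in math.exp for very negative arguments
        return 0.0
    return 1.0 / (1.0 + math.exp(-x))


def sample_output(inputs, weights, thetas, T=1.0, rng=random):
    """Sample a binary output a_j for every output unit j."""
    outputs = []
    for j, theta in enumerate(thetas):
        net = sum(w * a for w, a in zip(weights[j], inputs))
        p = activation_probability(net, theta, T)
        outputs.append(1 if rng.random() < p else 0)
    return outputs


def perceptron_update(inputs, targets, outputs, weights, thetas, eta=1.0):
    """Perceptron-style update, applied in place and only to units that
    produced the wrong output."""
    for j, (target, output) in enumerate(zip(targets, outputs)):
        error = target - output  # +1: should be on, -1: should be off
        if error == 0:
            continue
        for i, a in enumerate(inputs):
            weights[j][i] += eta * error * a  # only active inputs contribute
        thetas[j] -= eta * error  # a lower threshold makes the unit fire more easily


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy binary patterns standing in for wickelfeature vectors (hypothetical).
    patterns = [([1, 0, 1, 0], [1, 0]), ([0, 1, 0, 1], [0, 1])]
    weights = [[0.0] * 4 for _ in range(2)]
    thetas = [0.0, 0.0]
    for _ in range(50):
        for root, past in patterns:
            produced = sample_output(root, weights, thetas, T=1.0, rng=rng)
            perceptron_update(root, past, produced, weights, thetas)
    print(weights, thetas)
```

Lowering T in activation_probability makes the sampled outputs almost deterministic, which corresponds to the step-function limit described in section 2.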