Laboratory 1

Neural Network Models of Cognition and Brain Computation
© W. B. Levy 2004
Laboratory 1: Generating random inputs and evaluating some
of their statistics
Goals
1. To become familiarized with the MatLab environment, including how to execute simple
functions;
2. To learn about generating random numbers, seeding the random number generator, and
controlling random numbers with respect to mean and variance;
3. To introduce elementary statistical ideas about sample size and convergence of a sample
average; and
4. To learn how to produce normalized vectors that will serve as inputs to our neural networks
in the future.
Note: You are not required to do a formal write-up for this laboratory, so don’t worry about
including an introduction, methods, and discussion section. Your grade will be based on your
responses to the exercise questions and on the quality of your figures and figure legends. This
means your figures should be appropriately labeled and you should give a clear explanation of
the significance of each figure. See the In-Class Introduction to MatLab for tips on how to label
your figures. Be sure to include the MatLab commands that you used in each exercise in an
appendix at the end of your laboratory.
Introduction
Averages are useful to scientists, sports aficionados, politicians, and just about anybody else who
wants to do a better job. One way we use an average is to summarize a set of observations and to
represent this set. Averages can also be used to generate predictions of future events. Because the
brain generates predictions, it may be the case that individual neurons generate predictions. If so,
we might suspect that individual neurons base their predictions on some kind of averaging
process.
When a process generates events that are not perfectly predictable, we often use statistics to
summarize these events and then to predict future events. The most common statistic is the mean.
A mean can be calculated for anything that can be measured or counted.
Consider two simple examples of measured events. The weight of a first-year university student
is a variable that can be measured. If we calculate the mean weight of 1000 students, we can use
this value to predict the average weight of a different sample of 1000 first-year students. Indeed,
we can use this average to predict the weight of a single student. As a neural example, we can
estimate the number of neurons in the cerebral cortex of a mouse. This number differs among
individual mice no matter how homogeneous their breeding. Despite this variability among
individuals, we can calculate the mean number of neurons in one sample of mice and use this
average to predict the number of neurons in a mouse that is not part of this sample. Although this
prediction will not be perfect, we are likely to guess closer to the correct value than if we were to
guess without knowing this average value.
Events can also be averaged in terms of their frequency over time. For example, we can’t
measure a person walking into the student union in the same sense that we can quantify that
person’s weight. But we can count the number of people walking into the union between 3 and 4
PM. Thus, if 126 people come into the union over a 60-minute period, then we can conclude that,
on average, 2.1 people enter every minute. Such averages are also useful in neuroscience—for
example, we can estimate the rate at which neurons fire action potentials, or spikes, by counting
the number of spikes that occur during a particular time interval.
Simple Averages
In describing averages, we must distinguish, in a simple way, between the population mean and
the sample mean. Both values can be called means—we can think of the population mean as the
“true mean” or the “predicted mean” and the sample mean as the "empirically observed mean."
The population mean of MatLab’s1 random number generator (rand(), which returns a value on
the interval [0,1)) is essentially 0.5. However, because of the limited decimal precision of the
underlying computational hardware and because the sample mean of a random variable is itself a
random variable, a sample mean will almost never equal exactly 0.5. As you will see below, the sample mean of
the random number generator will almost certainly differ from the population mean. The rate at
which a sample mean approaches the underlying population mean (assuming this theoretical
value is finite and exists in a stationary form) is of interest if we are going to use a sample mean
for prediction.
When it comes to predicting future events, averages have optimal predictive properties. Because
neurons are in the business of prediction, 1) we can hypothesize that synapses construct some
kind of average as they incorporate information from the neuronal activity they experience, and
2) even if this is not the case, we can at least compare the behavior of a biological neuronal
system to a model, computational system that does construct such averages.
Finally, empirical averages are single values that summarize many observed events from the past.
In this sense, these averages are a form of memory, and they are an example—perhaps the
simplest example—of abstraction. Memory and abstraction are surely cognitive processes worthy
of study and understanding.
Convergence in Mean
Convergence starts out as a simple mathematical idea. For example, 1/x converges to 0 as x grows
larger. Mathematicians have taken this idea of convergence and applied it in a special way to a
random variable. For random variables and functions of such variables, the only really sensible
and interesting thing we can say about them concerns how certain computations converge. What
is most interesting to us is the convergence in mean or the mean of a function. In fact, this is
probably the most we will ever want to say about many random variables.

1 Refer to ‘In-Class Introduction to MatLab’ for an overview of these commands.
An important parameter that controls convergence of a function of a random variable is sample
size. For example, if a stationary random process generates a value of variable x with uniform
probability on the interval [0,1), then the mean value of x should converge to 0.5 if there are
enough samples from this process.2
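As a quick illustration that you can try at the MatLab prompt (a sketch only; the exact values you see will differ, since each call draws new random numbers):
>>mean(rand(10,1))       %a small sample; its mean may fall far from 0.5
>>mean(rand(10000,1))    %a much larger sample; its mean is usually much closer to 0.5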
Section 1.1: Generating random numbers
Neural network modelers need more than just a model of the brain—they also need a model of
the world that provides inputs to the brain. It is part of the art (and science) of attempting to
understand the biological basis of thought by quantitative modeling. To be successful, such
modelers need to recreate the paradigmatic essence of the problems confronting living organisms
on a day-to-day, moment-to-moment basis. In other words, here we’ll use computer simulations
to understand how neural computations can yield cognition. To perform such simulations, we
will need inputs (what we call input environments) to present to our neural-like systems.
It might seem desirable to use real world inputs, but it is not always feasible to do so. Using real
world inputs would require networks that are too large and too complex for beginners (and often
too complex for any research lab in the world). Instead, we will create artificial worlds that are,
in some specific aspects, prototypical of the real world problems that humans actually solve. A
probability model is often the preferred way to create an artificial world. A probability model
assumes that the real world is based on random variables with one or more underlying probability
generators. A good analogy is the probability generator of MatLab—this generic generator is
governed by a deterministic but chaotic process. Interestingly, chaotic processes might also be
the basis of randomness in the real world.
Let’s use MatLab’s random number generator to create a three-dimensional column vector that
can be used as an input to a network. Enter the command
>>VectorA=rand(3,1)
This command generates an ordered list of three pseudo-random numbers. This ordered list, or
vector, has three rows and one column, corresponding to the (3,1) arguments of the rand()
command3.
In addition to generating a column vector of three random numbers, the rand() command also
assigned this vector to the variable named VectorA. Enter VectorA to convince yourself of this
fact—i.e., ask MatLab what, if any, values are currently assigned to VectorA by typing the
following:
>>VectorA                       %VectorA will contain this set of numbers until you
                                %clear VectorA, set VectorA equal to something else,
                                %or exit MatLab

MatLab will print the three-dimensional column vector of random numbers to the screen.
Then you can clear the vector and make sure that it is empty by typing:

>>clear VectorA                 %unassign VectorA
>>VectorA                       %VectorA no longer refers to any set of values

Now MatLab will report that VectorA is undefined, since the variable has been cleared.

2 As you’ll see in Laboratory 3, a converging sample mean can actually move away from the asymptotic value so that, from sample to sample, convergence is not uniform. (The rate of approach to the mean is also known as the rate of convergence.) When this happens, it is helpful to know how much we should trust our sample mean as an approximation of the population mean as sample size increases or decreases.

3 You should memorize this row-by-column notation—it is the standard of linear algebra and is important for taking full advantage of MatLab’s capabilities. If your notation is not consistent, you will get errors when you try to implement commands that require the proper dimensions of input vectors and matrices.
There are many different generating processes that depend on the probability distribution used to
generate random numbers. MatLab’s rand() command picks random numbers lying between
zero and one with a uniform (i.e., the same) probability for picking any particular number. You
can, however, alter what this generator produces. You will learn how to shift and rescale the
random number generator in section 1.3.
Convergence of Averages
If an infinite number of samples could be chosen, the mean of these random numbers should be
just less than 0.5, and the variance should be 0.0833 (i.e., 1/12). Let’s see what happens with finite
sampling.
Exercise 1.1.1.
Generate sets of 10, 100, 1,000 and 10,000 random numbers and calculate the mean,
variance, and standard deviation of each set.
>>VectorB = rand(10,1); %generate a column vector of 10 random numbers
%and suppress output; set VectorB equal to this
%set
>>mean(VectorB)
%generate the mean value of the elements in the
%column vector named VectorB—this is the sample
%mean of the elements in the vector
>>var(VectorB)
%generate the variance of the elements in
%VectorB
>>std(VectorB)
%generate the standard deviation of the elements
%in the column vector named VectorB
Now generate 100 random numbers and find their mean and standard deviation. Do this
five times. What are the mean and standard deviation of these five means? Do the same
for five sets of 1,000 random numbers, and then for 10,000. Comment on how close the
means of each sample come to what you would predict the mean to be as a function of
sample size.
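One possible way to organize this repetition is sketched below (the variable names SampleMeans, SetIndex, and SampleSet are only suggestions, not required by the lab; repeat the sketch with rand(1000,1) and rand(10000,1) for the larger sets):
>>SampleMeans = zeros(5,1);            %storage for the five sample means
>>for SetIndex = 1:5
      SampleSet = rand(100,1);         %one set of 100 random numbers
      SampleMeans(SetIndex,1) = mean(SampleSet);
  end
>>SampleMeans                          %the five individual sample means
>>mean(SampleMeans)                    %mean of the five means
>>std(SampleMeans)                     %standard deviation of the five means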
Exercise 1.1.2.
Now generate 1,000 random numbers and calculate the mean of the sample using a ‘for’ loop
instead of MatLab’s mean command. A 'for' loop is a programming construct that performs the
instructions it contains a specified number of times. In other words, update the mean as you
iterate through the elements in the vector, starting at element 1 and ending at element 1,000. Plot
the mean as a function of sample number and comment on how the mean changes as you iterate
through the sample.
>>VectorC = rand(1000,1);        %generate a column vector of 1000 random numbers
                                 %and suppress output; set VectorC equal to this set
>>CurrentAve=0;                  %initialize the current average to zero
>>for I=1:1000
      CurrentAve=(CurrentAve*(I-1)+VectorC(I,1))/I;
                                 %update CurrentAve by adding the Ith value of
                                 %VectorC to (I-1) times the CurrentAve from the
                                 %previous timestep and dividing by I, the sample
                                 %number
      StoredAve(I,1)=CurrentAve; %StoredAve contains a history of CurrentAve
  end
>>plot(StoredAve)                %the plot command is placed outside
                                 %the loop to speed the program
The tiny 'program' above need not be beautifully formatted when you enter it into MatLab,
but formatting programs with indentation improves a program's readability and will help
you to find problems with longer programs.
Check that this program is actually working. Use the command
>>mean(VectorC(1:I,1))
to calculate the mean of the first I values stored as part of the vector named VectorC.
Explain why the averaging algorithm works. (Hint: Write the formula for an average as you have been taught,

$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$,

and show equality to the algorithm.)
It’s always important to check the quality of any random number generator because none of those
available is truly random. Uniformity means that the generator supposedly produces values from
[0,1) with equal probability for each value. A proper statistical check is beyond the scope of this
course, but there is one way to take a quick look.
A histogram is a convenient way to look at the distribution of many scalar, random numbers. A
histogram is a bar graph that is constructed automatically by a numerical categorization rule
called 'binning'. For example, suppose you have a set of 1000 elements whose values range from
1 to 100. MatLab's default number of bins for the graph is 10; that is, there are 10 categories or
'bins'. The number of elements in each bin is plotted as an individual bar. MatLab's hist command
will count the number of elements which fit into each bin and then display these counts as a bar
graph. Ordinarily, the bins are evenly spaced (bin width = range of the data divided by the number
of bins), but this does not have to be true.
MatLab easily generates histograms using the following code:
>>rand('seed', 9771877)            %seed the random number generator
>>VectorZ = rand(10,1);            %VectorZ is a column vector of 10
                                   %random numbers; screen printing is
                                   %suppressed by the semicolon
>>subplot(1,2,1),hist(VectorZ)     %this subplot generates the 1st of two
                                   %absolute frequency histograms—here a
                                   %histogram of the 10 values using the
                                   %default number of 10 bins, each bin
                                   %covering a range of 0.1
>>subplot(1,2,2),hist(VectorZ,8)   %this subplot generates the 2nd of two
                                   %absolute frequency histograms, this one
                                   %has 8 bins
Fig. 1.1 illustrates the plots created by the code above. These types of histograms are called
absolute frequency histograms, but they can be converted into relative frequency histograms by
dividing the individual bin counts by the total of all bin counts. With enough (infinite) samples, a
relative frequency plot reproduces the underlying probability distribution that was used to
generate the random numbers. MatLab easily generates relative frequency histograms as the
following code illustrates (see Fig. 1.2).
In the next set of commands, you use a command called 'figure.' The figure command opens a blank figure
window and is important because it will allow you to open several figures at once. Without
opening another figure, MatLab would overwrite your previous figure.
>>figure                                   %start the second figure (Fig. 1.2 below)
>>[BinCounts,BinLabels] = hist(VectorZ)    %BinCounts is assigned the number
                                           %of elements in each bin in the
                                           %histogram and BinLabels is
                                           %assigned the center of each bin in
                                           %the histogram
>>RelativeFreq = BinCounts/sum(BinCounts)  %RelativeFreq is BinCounts divided
                                           %by its sum
>>bar(BinLabels,RelativeFreq)              %creates a relative frequency histogram
                                           %from the values in RelativeFreq using the
                                           %bar command
Fig. 1.1. Absolute frequency histograms of 10 uniformly distributed random numbers. The absolute number of random numbers generated is plotted in each bin. In A, there are 10 bins, each of size 0.1. Bin 0.5-0.6 is empty because no random numbers were generated in this range. In B, the same 10 random numbers are plotted as a histogram using only 8 bins. You can see how changing bin width alters the absolute frequency histogram.

Fig. 1.2. Relative frequency histogram (bar graph) of the data plotted in Fig. 1.1A. Fig. 1.2 uses the information in Fig. 1.1A to create a probability distribution in the form of a relative frequency histogram. As the number of generated values increases, a relative frequency histogram will approach a uniform distribution.

Exercise 1.1.3.
Compare the relative smoothness of the histograms and relative frequency plots when 10,
100, 1000, and 10,000 random numbers are used. How does bin width affect this
smoothness? Be sure to consider absolute values of bin counts as well as relative values.
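A minimal sketch of one such comparison (VectorD, BinCounts, and BinCenters are our own variable names; repeat with 10, 100, and 10,000 values, and give hist a second argument to change the number of bins):
>>VectorD = rand(1000,1);                 %one of the sample sizes to compare
>>subplot(1,2,1), hist(VectorD)           %absolute frequency histogram, default 10 bins
>>[BinCounts,BinCenters] = hist(VectorD);
>>subplot(1,2,2), bar(BinCenters,BinCounts/sum(BinCounts))
                                          %the matching relative frequency histogram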
Section 1.2: Seeding the random number generator
Suppose that, as an afterthought, you want to determine the mean and variance of a set of random
numbers. Each set of values must be called by a unique name. If you used the variable VectorZ
to store three different vectors (with elements numbering 10, 100, and 1000) in the last exercise,
then the first two vectors (i.e., those with 10 and 100 elements) are lost forever and you cannot
recreate them. This lack of reproducibility can be a problem if you want to compare simulations
of two different models given the same sequence of inputs, especially if you forget to save the
inputs. Fortunately, there is a standard procedure for avoiding this problem.
Because computer-generated random numbers are, properly speaking, pseudo-random numbers,
we can generate the identical sequence whenever we want so long as we start the generator the
same way every time. The initial value of the generator is controlled by a seed value using the
command
>>rand(‘seed’, 1204513528)
%1204513528 is an example of a seed value
Try the following sequence of commands:
>>VectorX = rand(4,1)      %generating a column vector of 4 random numbers twice
>>VectorX = rand(4,1)      %generates two different sets of random numbers
                           %because the random number generator is not started
                           %the same way both times
>>rand('seed',16)          %try it again; this time seed the generator with the
                           %same number both times; here we happen to use 16
                           %as the seed value
>>VectorX = rand(4,1)
>>rand('seed',16)
>>VectorX = rand(4,1)      %now the same set of random numbers is generated twice
                           %because the random number generator was started the
                           %same way both times
Important Tip: Remember from the in-class introduction to MatLab that it is important to seed
the random number generator at the beginning of each lab and more often if desired. Your seed
must be unique—use a value like your social security number or a simple perturbation of it. In
any case, do not lose your seed values—without them, you can’t reproduce your results, and a
non-reproducible result is not useful.
Section 1.3: Relocating and Rescaling the Random Number Generator
The default range for the command rand() is [0,1). You will often need a sample of random
numbers with a different range. By adding or subtracting a constant, you can move the range, and
by multiplying or dividing, you can change the width of the range. The following exercises will
teach you how to move and rescale the random number generator.
Exercise 1.3.1.
Here you will generate random numbers on intervals that are different from the [0,1)
MatLab default. First, generate nine random numbers on the interval [–.5, +.5).
>>rand(9,1)-.5
%subtract 0.5 from each random number to relocate the
%default interval
Basically, what you are doing here is generating 9 random numbers between 0 and 1 and
then subtracting .5 from each one to shift the scale.
Now generate five random numbers on the interval [0, .5).
>>(rand(5,1))/2
%contract the interval by dividing by 2; this also
%relocates the interval
Here, you are generating 5 random numbers between 0 and 1 and then dividing each one
by 2 to reduce the range of the scale to [0, .5). How do the mean and variance of the 5
elements in this sample in this range differ from those obtained with the default range of
[0,1)?
Now compare the estimated mean and variance for the two different ranges sampled
above. For a uniform distribution with range [a,b) and an infinite number of samples, the
mean is (a+b)/2 and the variance is (b–a)^2/12. Your observations will be a little different
than you might expect because you are using a finite number of samples.
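As a quick check of these formulas, the sketch below uses the [–.5, .5) interval from this exercise but with a much larger sample than the exercise asks for, so the estimates sit near the theoretical values of mean 0 and variance 1/12 ≈ 0.0833 (the variable name ShiftedSample is only a suggestion):
>>ShiftedSample = rand(10000,1)-.5;     %a large sample on [-.5,.5)
>>mean(ShiftedSample)                   %compare to (a+b)/2 = 0
>>var(ShiftedSample)                    %compare to (b-a)^2/12 = 1/12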
Exercise 1.3.2.
1. Create four different data sets using the uniform distribution and samples of 4, 16, 256,
and 8192 random numbers over the interval [–1,1). (Hint: first expand the interval to the
desired width and then shift its location).
2. Plot the results as histograms and comment on how well the results approximate the
uniform distribution.
3. Calculate the means and variances for each set and compare these values to the
theoretical ones. As the number of samples increases, how do your results compare to any
observed or expected trend?
4. Comment on the rate of convergence as a function of sample size. Plot the means and
variances as the sample size increments. Do these measures converge as sample size
increases? (Note: the subplot command will make for a more compact set of plots).
Hint – You will need to create vectors to record the mean and variance for different
sample sizes. This is much like recording the history of the average, and you could use a
'for' loop to help.
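One way to record that history, as the hint suggests, is sketched below (SampleSizes, SizeIndex, Sample, SampleMeans, and SampleVariances are our own names, not prescribed by the lab):
>>SampleSizes = [4 16 256 8192];
>>for SizeIndex = 1:4
      Sample = 2*rand(SampleSizes(SizeIndex),1)-1;      %uniform on [-1,1)
      SampleMeans(SizeIndex,1) = mean(Sample);
      SampleVariances(SizeIndex,1) = var(Sample);
      subplot(2,2,SizeIndex), hist(Sample)              %one histogram per sample size
  end
>>SampleMeans                           %compare to the theoretical mean of 0
>>SampleVariances                       %compare to the theoretical variance of 4/12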
Exercise 1.3.3.
Compare the variances, means, and histograms for sample sizes of 10 and 1,000 random
numbers. Use three different functions of the rand() command:
1. rand()
2. 2*rand()
3. 1+rand()
Plot the results as histograms and rank order the means and variances by size. Explain
how the distribution of the histograms reflects the rank orders. Also, plot the variances as
a function of sample size. How does the variance change as sample size increases? If you
could sample long enough, to what value would each mean converge? Discuss the
relationship between number of samples and convergence.
Optional: The command rand+rand produces an interesting result—compare it to the
2*rand command.
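A sketch of one way to set up the comparison (SampleSize and the SampleA-SampleD names are only suggestions; rerun it with SampleSize = 10):
>>SampleSize = 1000;
>>SampleA = rand(SampleSize,1);                      %rand()
>>SampleB = 2*rand(SampleSize,1);                    %2*rand()
>>SampleC = 1+rand(SampleSize,1);                    %1+rand()
>>[mean(SampleA) mean(SampleB) mean(SampleC)]        %rank order the means
>>[var(SampleA) var(SampleB) var(SampleC)]           %rank order the variances
>>SampleD = rand(SampleSize,1)+rand(SampleSize,1);   %the optional rand+rand comparison
>>subplot(1,2,1), hist(SampleB), subplot(1,2,2), hist(SampleD)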
Section 1.4: Normalization
Subsequent laboratories will not work properly unless you normalize input vectors and synaptic
weight vectors. Normalization is a technique used to standardize the length of a vector so that all
vectors are of the same length. The idea of normalization also has some biological merit.
A condition typically enforced on the inputs to a neural network is normalization of a
multidimensional input or, in more realistic models, approximate normalization. Normalization
makes pattern recognition much easier. The idea is that the absolute strength of a signal often
says nothing about its meaning—it’s like playing the radio at two different volumes. The songs,
weather, and news reports are the same, but the signal is louder in one case. Similarly, if we see
an object in dim light or in bright light, we still see the same object, and we can scale for the
amount of illumination. Indeed, there are neurons sensitive to global light levels (and even to
global contrast) that make such scaling not only sensible but also feasible for the nervous system.
We will use Euclidean normalization for our purposes in this class. We call it Euclidean because
we use Euclidean geometry to define the measure of length. The Euclidean distance between two
vectors $\mathbf{X}$ and $\mathbf{Y}$ is defined as $\sqrt{(\mathbf{X}-\mathbf{Y})^T(\mathbf{X}-\mathbf{Y})}$. For a right triangle* with sides $a$, $b$, and hypotenuse $c$, the length of $c$ is
$\sqrt{a^2+b^2}$. For a two-dimensional vector $\mathbf{X} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$, the distance of this point from the origin is the
Euclidean length of the vector, $\sqrt{x_1^2 + x_2^2}$. Normalization makes all vector lengths equal to one. Thus,

$\frac{\mathbf{X}}{\sqrt{x_1^2 + x_2^2}} = \begin{bmatrix} \frac{x_1}{\sqrt{x_1^2 + x_2^2}} \\ \frac{x_2}{\sqrt{x_1^2 + x_2^2}} \end{bmatrix}$

is certain to be length one. The idea easily generalizes to any number of dimensions. For example,
in three dimensions, $\mathbf{X} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$ and normalized it is $\frac{\mathbf{X}}{\sqrt{x_1^2 + x_2^2 + x_3^2}}$.

You will recognize the denominator if you look carefully at the definition of Euclidean distance,
$\sqrt{(\mathbf{X}-\mathbf{Y})^T(\mathbf{X}-\mathbf{Y})}$. Consider the multiplication of a vector by its transpose:

$\mathbf{x}^T\mathbf{x} = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = x_1 x_1 + x_2 x_2 + x_3 x_3 = x_1^2 + x_2^2 + x_3^2$

This is exactly what you need (under the square root) in the denominator when you normalize your vector.

* It is conventional to describe the position of a point relative to axes at right angles to each other.

To visualize this idea, we consider normalization to the all-positive orthant of the hypersphere
(equivalent to quadrant one in two dimensions). A circle is a two-dimensional hypersphere, and a
sphere is a three-dimensional hypersphere. In general, hyperspheres are in higher dimensions
than we can depict pictorially. Nonetheless, they exist as geometric objects, and they are very
important in neural network theory.
So what does normalization to the all-positive orthant of the hypersphere mean in two
dimensions? In this case, the data points lie on the positive quadrant of the unit circle centered at
the origin (see Fig. 1.3). To make this happen, you need a vector of non-negative elements and
the sum of the squares of the elements of the vector must equal one.
Fig. 1.3. A two-dimensional hypersphere,
or circle. The outlined quadrant is quadrant
one, or the all-positive orthant of the
hypersphere.
The remainder of this lab illustrates two ways to create input sets that are normalized.
To see how to normalize to the unit circle, we first need to generate and display 50 random, 2-dimensional data points. The following lines of code do just that (see also Fig. 1.4).
>>dimension1 = rand(50,1);          %generate 50 random numbers on [0,1) and
                                    %do not print
>>dimension2 = rand(50,1);          %generate 50 random numbers on [0,1) and
                                    %do not print
>>plot(dimension1,dimension2,'o')   %plot unconnected 2-D points as o's as
                                    %shown in Fig. 1.4
>>axis('square')
>>axis([0,1,0,1])
Now we’ll normalize the set of 2-dimensional data points that was just generated. Remember that
normalization can occur for any dimension. One way to perform this normalization is as follows.
For the unit circle,

$x_1^2 + x_2^2 = 1$,

where x1 and x2 are the coordinates on the abscissa and ordinate for each pair of data points,
respectively. We can use this relationship, as the bit of MatLab code below shows, to
calculate two new vectors which, when plotted as pairs, fall on the unit circle (see Fig. 1.5). In
other words, this bit of code uses MatLab's period operator to help normalize the 50 pairs of data
points to the unit circle.
Example: Normalizing vectors
Here is an example where we generate a single three-dimensional input and normalize it to unit
length (i.e., onto the unit hypersphere):
>>x = rand(3,1)                 %generate 3 random numbers as a column vector
>>ssx2 = x'*x                   %multiply x by its transpose to get the inner
                                %product, which is a scalar
>>normalizedx = x/sqrt(ssx2)    %normalize the original vector by dividing each
                                %of the original random numbers by the square
                                %root of this scalar
>>normalizedx'*normalizedx      %check to see if the new vector has unit
                                %length (the result should be 1)
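The same idea extends to many vectors at once. The sketch below is not part of the original lab; it normalizes 50 three-dimensional column vectors stored as the columns of a matrix (X, VecLengths, and Xnormalized are our own names):
>>X = rand(3,50);                           %50 three-dimensional column vectors
>>VecLengths = sqrt(sum(X.^2,1));           %a 1-by-50 row of Euclidean lengths
>>Xnormalized = X./repmat(VecLengths,3,1);  %divide each column by its own length
>>sum(Xnormalized.^2,1)                     %check: every column should give 1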
Now we can try plotting a normalized vector so that we know how to visualize such data.
>>dimension1=rand(100,1);            %create each dimension of 100 random numbers
>>dimension2=rand(100,1);            %in a column vector
>>figure                             %start the first figure
>>plot(dimension1, dimension2, 'o')  %plots unconnected pairs of 2-D points as o's
>>axis('square')                     %forces the axes to plot at the same scale
>>axis([0,1.1,0,1.1])                %forces the plot to include the origin &
                                     %the value 1.1
>>figure                             %start the second figure
>>NormalizationValue=sqrt(dimension1.^2+dimension2.^2);
                                     %this is the square root of the sum of the
                                     %squares, calculated individually for each pair
                                     %of elements; the period signifies an
                                     %element-wise operation that is not part of
                                     %linear algebra (see also >>help punct)
>>plot(dimension1./NormalizationValue,dimension2./NormalizationValue,'.')
                                     %plots unconnected pairs of 2-D points as dots
>>axis('square')                     %forces the axes to plot at the same scale
>>axis([0,1.1,0,1.1])                %forces the plot to include the origin &
                                     %the value 1.1
The above code will generate plots similar to those in Figures 1.4 and 1.5.
Fig. 1.4. Scatterplot of the pairs of 50 random numbers before normalization to the unit circle. Note how the points are scattered all over the graph. The vector named dimension1 consists of n uniformly distributed random values dimension1(1), dimension1(2), ..., dimension1(n). Similarly, the vector dimension2 also has n uniformly distributed random values, where n = 50. The data points plotted here are pairs of values (dimension1(i), dimension2(i)), where 1 <= i <= n; e.g., the first point is (dimension1(1), dimension2(1)).

Fig. 1.5. The same 50 data points are normalized to the unit circle. The normalized vector, dimension1, is called Normalized Dimension1; the normalized vector, dimension2, is called Normalized Dimension2. Note the distribution of the data points along the arc of the unit circle.
Exercise 1.4.1.
Create two dimensions of 200 random values each and normalize them to the unit circle.
First take these same data and plot the sequential values of each dimension. Then plot
them one against the other. Be sure to square the axes of your plot.
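A minimal sketch of one way to lay out these plots (dim1, dim2, VectorLengths, Norm1, and Norm2 are our own names):
>>dim1 = rand(200,1);                        %200 random values for each dimension
>>dim2 = rand(200,1);
>>VectorLengths = sqrt(dim1.^2 + dim2.^2);
>>Norm1 = dim1./VectorLengths;               %the normalized dimensions
>>Norm2 = dim2./VectorLengths;
>>subplot(1,3,1), plot(Norm1,'.')            %sequential values of the first dimension
>>subplot(1,3,2), plot(Norm2,'.')            %sequential values of the second dimension
>>subplot(1,3,3), plot(Norm1,Norm2,'.'), axis('square'), axis([0,1.1,0,1.1])
                                             %one dimension plotted against the other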
Note: The idea of normalization is important enough that you might be quizzed on this at the
beginning of the next lab. Most importantly, remember that normalization makes all vector
lengths equal to one.
Section 1.5: Generating uniformly random points on an arc of a circle
This section will help you learn how to generate the input sets that will be used in the next
laboratory. As you may have noticed, it might be difficult to generate points with uniform
probability on an arc smaller than 90° (π/2 radians). Generating points directly on the circle
makes it easy. The key is to think in terms of the circumference of the unit circle. By definition, a
unit circle has a radius of one.
Calling the radius r, the circumference of any circle is 2πr. For the unit circle, the circumference is
2π. This simple idea is used to generate points in a uniform way on the unit circle or an arc of
that circle. If we think of the entire 2π circumference of a circle as an arc of 360°, then we can
translate any angle to an arc length. The angle of interest has its vertex at the origin. (Note that
arc lengths on the unit circle are in units called radians.) The arc length of a specified angle can be calculated as:

arc length of angle x = (x degrees / 360 degrees) × 2π.

The arc length from 0° to 90° is 1/4 of the circle and equals (90/360)×2π, or π/2 radians.
Suppose we want to generate 100 uniform random values over an arc of 90°. (Recall that
90° = π/2 radians.) Then
>>RandPoints=rand(1,100)*pi/2;      %generate 100 random points from 0
                                    %to π/2 and suppress printing; here
                                    %we use the MatLab command, pi
The points are now evenly distributed over the desired range.
To plot these points requires trigonometry in order to convert a position on the unit circle
to a point specified in rectangular coordinates. The following bit of MatLab code will plot
the points and it is another way to normalize to the unit circle.
>>dimx1=cos(RandPoints);
>>dimx2=sin(RandPoints);
>>plot(dimx1, dimx2, ‘.’)
>>axis(‘square’)
>>axis([0,1.1,0,1.1])
For a right triangle, remember that sin(θ) equals the side opposite divided by the
hypotenuse and that cos(θ) equals the side adjacent divided by the hypotenuse. For a point
(x1, x2) on the unit circle, the hypotenuse is 1, so

sin(θ) = side opposite / hypotenuse = x2 / 1 = x2
cos(θ) = side adjacent / hypotenuse = x1 / 1 = x1
Thus the cosine and sine of a point on an arc locate its position in terms of the abscissa
and ordinate.
Because MatLab requires that we use radians as the units, no further conversion is
necessary given that the random points are in radians.
Exercise 1.5.1.
1. Generate 100 points that are very uniformly distributed on the all-positive portion of
the unit circle.
2. Generate points uniformly for the arc that goes from 30° to 60° (π/6 to π/3 radians).
Recall the scaling and shifting technique introduced in Section 1.3.
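For part 2, here is a sketch of the scale-and-shift approach (the variable name ArcPoints is our own):
>>ArcPoints = pi/6 + rand(1,100)*(pi/3 - pi/6);   %uniform angles on [pi/6, pi/3)
>>plot(cos(ArcPoints), sin(ArcPoints), '.')       %convert to rectangular coordinates
>>axis('square')
>>axis([0,1.1,0,1.1])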
Section 1.6. Possible Quiz Questions
•  How would you use a ‘for’ loop in MatLab to update the current average as you iterate
   through the elements in a vector?
•  How does a relative frequency histogram differ from an absolute frequency histogram?
•  Why is it important that you seed the random number generator?
•  How would you shift the range in which random numbers are generated in MatLab from the
   default [0,1) to [-1,-.5]?
•  Explain why it is important to normalize your vectors before using them as inputs to your
   network.
Appendix 1.1. Programming technique and debugging4
Let's put things into perspective. Suppose you are building a house. A sloppy contractor will
pour your foundation for $10,000, but it will probably be crooked. On the other hand, another,
very careful foundation contractor who works a little slower wants to charge you $20,000, but you
will get a square and level foundation. Building the rest of the house on the square, level
foundation costs $100,000, but building the house properly on the crooked foundation will cost you
$130,000. Now which contractor do you choose?
When it comes to programming, your time is your cost. It is almost a certainty, especially for
larger programs (and a large program is a relative thing), that you will spend more time
debugging than the initial time spent programming. Thus, it makes sense to spend a little extra
time writing the program from the start because it can easily save you a lot more time at the
debugging stage.
Debugging is often the hardest part of writing a complex program. Small typos, misspelled
variables, improper use of functions, and in fact just about anything can cause a well-conceived
program to run incorrectly. The trick is to find the errors quickly and get the program working
with a minimum amount of rewriting. There are many ways to lessen the number of errors in
your program code, and there are several ways to trace the errors once the code is written. Here
we discuss both.

4 Thanks to Jason Meyers and Andrew Perez-Lopez for contributions to this appendix.
Here are some ways to avoid making errors when you are writing your code:
•  Make your code legible. Ideally, your code should be as close to regular English as
   possible. For example, suppose you want to write a nested for loop to print each
   number in a weight matrix. The following code will work:
for i=1:nr
    for j=1:nc
        x(i,j)
    end
end
but who can understand what this code fragment is doing? The following will be
much more useful in the long run and lead to easier debugging.
% go through each value in the weight matrix and print it out
for rowIndex = 1:maxnumberOfRows
    for columnIndex = 1:maxnumberOfColumns
        weightMatrix(rowIndex,columnIndex)
    end
end
If the first set of code were positioned in the middle of a page, perhaps nested in
another loop, it would be extremely confusing, while the second set of code would be
clear. Using longer variable names can be taxing on the fingers, but with the copy,
cut, and paste functions nedit provides, it’s not so bad, and it makes much less work
for your brain in the long run.
•  Write the code with lots of comments. After each line, write what any new variable is
to be used for, and what the calculation is going to do. This will make it much easier
to debug later and those who read your code will be able to understand what your
program is doing. Far and away the hardest thing to do when you are trying to read
code that you’ve written, or especially code written by someone else, is to figure out
what the purpose of a certain process or variable is.
•  Use descriptive variable names. Choose variable names that describe what they
   represent. For example, if you have a variable that represents the number of neurons
   in your network, call it number_of_neurons instead of something generic, like x.
   Descriptive variable names are crucial to the process of writing any code more than
   five or ten lines long.
A note about variable declaration: MatLab allows what computer science jargon
calls ‘implicit’ declaration of variables. Basically what this means is that you do
not have to tell MatLab when you want to make a new variable—you can just start
using it. Not all programming languages share this feature. While implicit
declaration of variables is sometimes very easy to understand, with long code it
can sometimes complicate debugging. If you see a new variable when trying to
trace through your code to find a bug, you often try to figure out what the variable
means. It may be that the variable had no meaning until this particular line of
code. This makes it tough to figure out what the code does. Thus, it again pays to use
descriptive names for variables.
•  Make your code more ‘robust.’ If you are writing a program that will be reused with
different parameters (such as numbers of neurons or percentages of connectivity),
write the original program with variables like number_of_neurons and
number_of_synapses, in place of what are called ‘magic numbers,’ like constants
256 or 26. If you do this, then you can just copy and paste whole blocks of code,
changing only those lines that define the number of neurons and the number of
synapses. Many errors come from copying and pasting lines and then missing a few
lines when making the changes to a new network.
•  Write your code in small, reusable chunks. If you write your code piece-by-piece,
   it will be easier to locate errors and, once you have debugged these pieces, you can
   reuse them in future programs. For example, you can write a generic piece of code to
   calculate the inner excitation of a neuron (you might even create a function to do
   this; a sketch of one possible function appears after this list). Once it is debugged, it
   can be used for all networks that need an inner excitation product calculated.
•  Do not write your program all at once and then try to run it. Run the pieces as
you write them. Try each function separately. Check the output at every step. Is it
doing what you thought? Do the inputs and outputs make sense? Are they the size you
would expect? It is much easier to debug small bits of code than an entire network.
•  Test subroutines using extreme values. If a variable can vary between 0 and 1, test
at the extreme of 0 and then the other extreme of 1. Then put the subroutines together
and rerun the same test values.
•  Use the clear command. If you frequently use some variables (e.g., input(x) or
output(y)) the current implementation of the network can fail if a previous network
had a different size input or output, or if the previous network left lots of data still in
the matrices. To eliminate this problem, just include clear variable_name in your
code. This command will erase all data in the stated variables so that the next running
of the network can start fresh.
Along with this, it is good to include a clear all statement at the very beginning of
a long program, so that all of the variables are cleared off, the memory is wiped clean,
and the program has a fresh slate. If you don’t have to carry any variables over
between network models, you can clear all between each network.
In the absence of proper clear statements, outputs from previous runs can mess up
current runs. Here’s a suggestion that may just kill two birds with one stone. At the
beginning of your code, write a clear statement for each variable you are using and
also state what the variable does. For example, the code used above might now look
like this:
clear rowIndex          %loop variable to specify the current row
clear numberOfRows      %the number of rows in my weight matrix
clear columnIndex       %loop variable to specify the current column
clear numberOfColumns   %the number of columns in my weight matrix
clear weightMatrix      %a matrix containing all the weights for my network
%………………………………………………………………………………
%the rest of your code goes in between the part above & the loop
%below
%………………………………………………………………………………
for rowIndex = 1:numberOfRows
    for columnIndex = 1:numberOfColumns
        weightMatrix(rowIndex,columnIndex)
    end
end
This method makes sure that all your variables are cleared out and provides a central
location where you can find out what all your variables mean. You might be thinking,
“I know what my own variables mean,” but when the code gets long, and you come
back to code written weeks before, it gets tricky! Doing this generally makes your
code more readable to yourself and any others who may feel inclined to read it.
•  For small debugging runs, leave out the end of line semicolons. When trying some
new code for the first time, it’s really useful to see how the data are changing. If you
leave out the ; at the end of each line, Matlab can display the variables as they are
changing. This is not very useful for 100x100 matrices, but it is really good for seeing
how counters, excitations, etc. are changing.
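As an illustration of the ‘small, reusable chunks’ suggestion above, here is a sketch of a function that computes the inner excitation of a neuron as an inner product (the function name innerExcitation and its arguments are hypothetical, not part of the course code):

function excitation = innerExcitation(weightVector, inputVector)
%INNEREXCITATION  inner product of a weight vector and an input vector,
%both supplied as column vectors of the same length
excitation = weightVector' * inputVector;

Save the function in a file named innerExcitation.m; it can then be called, for example, as innerExcitation(SynapticWeights, InputVector) from any of your network programs.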
Once the program is written, you think you have finally solved the tough problems, mapped out a
solution, and got it all coded in, but then you run the network and it grinds to a halt. Ack! What
now?
•  If the program stops running because of an error, look at the error message and at
the line in the program where the error is. Do you have the number of inputs required
by the called function? Check help function_name to see what the function needs.
•  Are all of the variables the size you think they will be? When Matlab stops for an
error, all of the variables are still stuck in memory (this is another good reason to
include clear statements every once in a while). Type the variable names and see if
they have the data that you thought they would. Make sure your vectors and matrices
are the right size. Many an error comes from forgetting whether a matrix has 26 rows
and 256 columns or 256 rows and 26 columns! If you have comments in the code
telling you what the matrix size should be, then it’s easy to make sure it’s the
expected size.
•  Alas, sometimes the real error in a program is many lines before Matlab detects it.
For this problem, it is useful to use a debugger, which will stop the program where
you want it to stop and let you run through line-by-line. Matlab includes a debugger—
here are some of its basic commands (use help debug for more information):
>>dbtype mfile
or
>>dbtype mfile start:end
Lists the specified m-file with line numbers for debugging (this can be a useful way
to look through the code). You need to know the numbers
of the lines in your m-file code where you want to start and stop. Look
through the program and write down the first line numbers of difficult
sections that you think may be causing the errors. For example, if it’s a for
loop, pick the line number of the for command.
>>dbstop in mfile at lineno
Places a breakpoint at the specified line number in the specified m-file.
When you execute the m-file, Matlab will temporarily stop when it gets to
the specified line and give the prompt:
K>>
This prompt means that Matlab is waiting to continue the program. At this
point, you can inspect any variables by typing their names, or you can
change the values of any variables.
To restart the program, there are two commands:
>>dbcont
Continues running the program until it reaches the next breakpoint.
>>dbstep
Runs the next line of the program. This is very useful for watching how a
variable is changed over a couple of lines of the program.
>>dbclear all
or
>>dbclear in mfile at lineno
Clears all breakpoints, or clears the specified breakpoint.
>>dbquit
Stops the program from running and returns Matlab to its normal state. Program is
not completed and nothing is returned.
Some other useful commands for debugging:
>>keyboard
When placed in an m-file, this command stops the m-file and gives the K>>
prompt, but does not allow dbstep and dbcont to be used. To get the program
running again, type return (the text, not the button on the keyboard).
>>who
Lists all of the active variables at that time. A good time to use this command
would be when you want to clear all those unneeded variables which might
interfere with your program.
>>whos
Like who, but also gives you the size of each of the variables.
All too common mistakes:
•  Typing in function(column,row) rather than function(row,column).
•  Forgetting the final parentheses in embedded commands. For example,
   test=input(excitation(counter),output(neurons(counter))
   needs one more parenthesis at the end.
•  Leaving data in variables from previous runs of a program. For example, if you
   use the variable output(x) for a network with 6 outputs and then for a network
   with only 4 outputs, the variable will still have 2 outputs from the old network.
   This will mess up later calculations. Include a clear output line in the
   beginning of the program.
•  Trying to assign something of the wrong size to a variable. If a network is giving a
   column vector output, make sure that the variable assigned to that output can
   accept a column vector. To help with this problem, write all of your programs so
   that they use column vectors and just stick with it!
Above all, remember to keep things simple. When first testing ideas, don’t use lots of deeply
embedded loops. Write simply, and you can debug simply. Once you know that your code is
going to work, expand it out into the loops through each neuron, variable, synapse, etc. Hopefully
you’ll take these suggestions and use them to write code that is easily understandable, easy to
debug, and elegant. Good luck!

Appendix 1.2. Glossary terms
population mean: the true mean or predicted mean
sample mean: the empirically observed mean
absolute frequency histogram: a histogram that displays individual bin counts for a set of data (the default number of bins in MatLab is 10)
relative frequency histogram: a bar graph that displays the results of dividing each of the individual bin counts by the total of all bin counts
seed value: the multi-digit unique number used to seed the random number generator in MatLab
normalization (aka Euclidean normalization): a technique used to standardize the length of a vector so that all vectors are of the same length
abscissa: the x-axis
ordinate: the y-axis