Lectures 9 Feed-Forward Neural Networks

Learning outcomes
You will be able to:
describe the common feed-forward neural network architecture;
describe various common transfer functions;
know how to code different types of data for input to a network.
We have met the hardlim transfer function. We do, however, allow other transfer functions;
in fact, multilayer networks need them, because hardlim makes training ineffective (its
output is flat on either side of the step, so it gives gradient-based training nothing to
work with).
The most common are
hardlim
hardlims
purelin
logsig (or sigmoid)
tansig
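The five functions listed above can be sketched in a few lines of Python. This is a minimal sketch, assuming the names carry their usual meanings from the MATLAB neural network toolbox (that source is an assumption, the notes do not say where the names come from); the sample inputs are purely illustrative.

```python
import math

def hardlim(n):
    """Hard limit: 1 if n >= 0, otherwise 0."""
    return 1.0 if n >= 0 else 0.0

def hardlims(n):
    """Symmetric hard limit: 1 if n >= 0, otherwise -1."""
    return 1.0 if n >= 0 else -1.0

def purelin(n):
    """Linear: the output equals the net input."""
    return n

def logsig(n):
    """Log-sigmoid: squashes n into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))

def tansig(n):
    """Hyperbolic tangent sigmoid: squashes n into the interval (-1, 1)."""
    return math.tanh(n)

# Illustrative net inputs only: print each function side by side.
for n in (-2.0, 0.0, 2.0):
    print(n, hardlim(n), hardlims(n), purelin(n),
          round(logsig(n), 3), round(tansig(n), 3))
```

Note that hardlim and hardlims jump at 0, while logsig and tansig change smoothly, which is what makes them usable in multilayer training.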
How do we use feed forward neural networks (or the five second
guide to neural network modelling)
Feed forward nets are used to classify patterns, recognise things or calculate functions.
We get them to do this by supervised training: that is, we present examples to them and
say "do this thing when you see something like this other thing". After training (if we get
it right) the network not only knows how to behave with the data we trained it on but
should also act correctly with completely new data. Sometimes this very simple idea gets
lost in the detail, because the things we want the net to recognise don't come ready made
to put into a computer. We have to do some work to get them into the right form.
Description of the Feed Forward Neural Network used for function approximation
The standard feed forward network that we will use attempts to model a function.
It has three layers:
input layer: no transfer function
hidden layer: uses logsig transfer
output layer: uses purelin transfer
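A forward pass through this three-layer arrangement can be sketched as follows. This is only an illustration: the weights and the 2-2-1 layer sizes are hypothetical, and the bias terms `b1` and `b2` are an assumption (the description above does not mention them).

```python
import math

def logsig(n):
    return 1.0 / (1.0 + math.exp(-n))

def forward(x, W1, b1, W2, b2):
    """Return (hidden activations, outputs) for one input vector x."""
    # Input layer: no transfer function, the values are passed straight on.
    # Hidden layer: weighted sum plus bias, then logsig.
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    # Output layer: weighted sum plus bias, purelin (identity) transfer.
    output = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(W2, b2)]
    return hidden, output

# Hypothetical weights for a 2-input, 2-hidden, 1-output network.
W1 = [[0.5, -0.5], [1.0, 1.0]]   # one row of weights per hidden neuron
b1 = [0.0, -1.0]
W2 = [[2.0, -1.0]]               # one row of weights per output neuron
b2 = [0.5]

hidden, output = forward([1.0, 1.0], W1, b1, W2, b2)
print(hidden, output)
```

Because the output layer is purelin, the network can produce any real number, which is what function approximation needs.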
How many neurons in each layer?
input: determined by function
hidden: ????? (choose to get good fit)
output: determined by function (well almost)
e.g. credit scoring. The input size depends on the data you possess. The output size
depends on what you are after: 1 for a credit score, 1 for a yes/no decision, 3 for
"£x at y% paid back over z months".
An example problem
We have some data on irises – numerical values and the classification of the iris the data
was taken from. Can we tell what kind of iris we are looking at just from the data (and so
do away with the need for botanists to do this job for us)?
See the data in iris.data. [we will make available in the labs]
A sample of this data is
6, 2.7, 5.1, 1.6, Iris-versicolor
6.7, 3.1, 4.7, 1.5, Iris-versicolor
4.3, 3, 1.1, 0.1, Iris-setosa
6, 2.2, 5, 1.5, Iris-virginica
5.8, 2.6, 4, 1.2, Iris-versicolor
The data is interpreted as follows:
6, 2.7, 5.1, 1.6 are the observations of 4 features of the iris.
This iris is of type Iris-versicolor.
4.3, 3, 1.1, 0.1 are measurements on an iris of type Iris-setosa.
We need to get this into a form we can classify with a neural net – so we use a numeric
coding. See the data in irisnumeric.data.
A sample of this data is:
6, 2.7, 5.1, 1.6, 1
6.7, 3.1, 4.7, 1.5, 1
4.3, 3, 1.1, 0.1, 2
6, 2.2, 5, 1.5, 0
5.8, 2.6, 4, 1.2, 1
The data is interpreted as follows:
6, 2.7, 5.1, 1.6 are the observations of 4 features of the iris.
This iris is of type 1, or Iris-versicolor.
4.3, 3, 1.1, 0.1 are measurements on an iris of type 2, or Iris-setosa.
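The conversion from the labelled data to the numeric coding can be sketched in Python. The class codes (0, 1, 2) are read off the samples in these notes; the function name `encode_line` is hypothetical, and the sketch assumes each line has exactly four comma-separated features followed by the class name.

```python
# Class codes as they appear in the numeric samples in the text.
CLASS_CODES = {"Iris-virginica": 0, "Iris-versicolor": 1, "Iris-setosa": 2}

def encode_line(line):
    """Split one labelled line into (list of 4 floats, numeric class code)."""
    fields = [f.strip() for f in line.split(",")]
    features = [float(f) for f in fields[:4]]
    return features, CLASS_CODES[fields[4]]

# Four rows from the labelled sample.
sample = [
    "6, 2.7, 5.1, 1.6, Iris-versicolor",
    "6.7, 3.1, 4.7, 1.5, Iris-versicolor",
    "4.3, 3, 1.1, 0.1, Iris-setosa",
    "6, 2.2, 5, 1.5, Iris-virginica",
]
for line in sample:
    print(encode_line(line))
```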
We want to train a NN to recognise such data, i.e. we want to plug in four values for the
features and have the network say "That was an Iris-versicolor" (or rather "That was a
number 1", or rather output 1).
So we create a network with four inputs, some number of neurons in the hidden layer (3,
say) and one neuron in the output layer, and use the default transfer functions (logsig in
the hidden layer, purelin in the output layer).
[picture]
We give it lots of samples to train on, and if it works, fine. If not, we try altering the
hidden layer size or using other transfer functions.
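To make the "create, train, check" loop concrete, here is a minimal sketch of a 4-3-1 network trained by plain batch gradient descent on the five numeric sample rows. Everything about the training method is an assumption made for illustration: the notes do not say which algorithm is used, and the learning rate, epoch count, random seed, initial weight range and bias terms are all hypothetical choices. Five rows are far too few for a real model; they only show the mechanics.

```python
import math
import random

def logsig(n):
    return 1.0 / (1.0 + math.exp(-n))

def forward(x, W1, b1, W2, b2):
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    y = sum(w * h for w, h in zip(W2, hidden)) + b2   # purelin output
    return hidden, y

def mean_squared_error(data, W1, b1, W2, b2):
    return sum((forward(x, W1, b1, W2, b2)[1] - t) ** 2
               for x, t in data) / len(data)

def train(data, n_hidden=3, epochs=500, rate=0.01, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    n = len(data)
    for _ in range(epochs):
        # Accumulate gradients of 0.5 * error^2, averaged over the batch.
        gW1 = [[0.0] * n_in for _ in range(n_hidden)]
        gb1 = [0.0] * n_hidden
        gW2 = [0.0] * n_hidden
        gb2 = 0.0
        for x, t in data:
            hidden, y = forward(x, W1, b1, W2, b2)
            err = y - t                      # purelin derivative is 1
            gb2 += err
            for j in range(n_hidden):
                gW2[j] += err * hidden[j]
                # Back through logsig: its derivative is h * (1 - h).
                delta = err * W2[j] * hidden[j] * (1.0 - hidden[j])
                gb1[j] += delta
                for i in range(n_in):
                    gW1[j][i] += delta * x[i]
        # Apply the averaged gradients once per epoch.
        b2 -= rate * gb2 / n
        for j in range(n_hidden):
            W2[j] -= rate * gW2[j] / n
            b1[j] -= rate * gb1[j] / n
            for i in range(n_in):
                W1[j][i] -= rate * gW1[j][i] / n
    return W1, b1, W2, b2

# The five numeric sample rows: four features, then the class code.
data = [([6, 2.7, 5.1, 1.6], 1), ([6.7, 3.1, 4.7, 1.5], 1),
        ([4.3, 3, 1.1, 0.1], 2), ([6, 2.2, 5, 1.5], 0),
        ([5.8, 2.6, 4, 1.2], 1)]

initial = mean_squared_error(data, *train(data, epochs=0))  # untrained weights
final = mean_squared_error(data, *train(data, epochs=500))
print("error before training:", initial)
print("error after training: ", final)
```

If the error after training is still too large, the `n_hidden` argument is the knob that corresponds to "altering the hidden layer size".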
How to code information into the inputs:
Numeric Input
e.g. weight, height, salary and age data (assuming a credit score output). These are
probably left as numerical inputs, one neuron each.
[picture]
However, beware of data collected over time. Take inflation as an example: don't use the
raw numbers but categorise them, for example as low, medium or high.
Non-Numeric Input
More data: weight, height, salary, age and gender.
Gender (male/female) is categorical data, not a number. Convert it to numeric: say 0 for
male and 1 for female.
Even more data: ethnic origin (white European, black British, black African, Asian
sub-continent etc.).
Here one choice is a single neuron taking the values 0, 0.25, 0.5, 0.75, 1.0, say.
Alternatively, use a bit map approach: represent ethnic origin with a group of neurons and
code it as a bit pattern, e.g. 0 0 0 for white European, 0 0 1 for black British, etc.
[picture]
There are three main types of coding data:
Linear or Local: suitable for numeric or ordered categorical data (e.g. income on a
categorised scale: low income = 1, middle = 2, high = 3).
Binary coding: for categorical data where no relationship between categories is expected
(e.g. ethnic origin or the Iris data). Only used to reduce the dimension; try to avoid it,
since the bit patterns impose a hidden order on the categories.
One-of-n or Distributed: for categorical data where no relationship between categories is
expected (e.g. ethnic origin or the Iris data). Preferred over binary, but may be unwieldy
if n is big.
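The three codings can be compared side by side in a short Python sketch. The function names are hypothetical, and the particular conventions (1-based ranks for the linear code, most significant bit first for the binary code) are assumptions chosen to match the examples in these notes.

```python
def linear_code(value, ordered_categories):
    """Linear coding: a single neuron holding the 1-based rank of the value."""
    return [ordered_categories.index(value) + 1]

def binary_code(value, categories):
    """Binary coding: the category index written as a fixed-width bit pattern."""
    n_bits = max(1, (len(categories) - 1).bit_length())
    index = categories.index(value)
    # Most significant bit first, e.g. index 1 with 3 bits -> [0, 0, 1].
    return [(index >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]

def one_of_n_code(value, categories):
    """One-of-n coding: one neuron per category, exactly one of them set to 1."""
    return [1 if c == value else 0 for c in categories]

income = ["low", "middle", "high"]
irises = ["Iris-virginica", "Iris-versicolor", "Iris-setosa"]

print(linear_code("middle", income))            # one neuron
print(binary_code("Iris-versicolor", irises))   # two neurons for three classes
print(one_of_n_code("Iris-versicolor", irises)) # three neurons
```

The trade-off is visible in the neuron counts: binary coding needs about log2(n) neurons, one-of-n needs n, which is why the latter becomes unwieldy when n is big.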
NB – coding can affect how easy it is to find a network which recognises the data.
How to code information into the outputs:
There are similar choices for the output neurons. If you want to train a network to
recognise categories you need to code them somehow. There is the additional problem of
decoding the output you actually get (this applies whatever you do).
Suppose we expect numerical output. Then if the neuron spits out 25, OK, it's 25.
Suppose instead we expect categorical output: 0 for male, 1 for female. The neuron spits
out 0.4?? Well, it's nearer to 0, so male. [You might want to know what values the NN
produced for the training data before deciding to put the cut-off at 0.5.]
[picture]
Similarly with distributed output 0.2 0.3 0.7: is this really 0 0 1?
The usual rule is to take the winning neuron (the one with the largest output) as a 1 and
the rest as 0.
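Both decoding rules are short enough to sketch in Python. The function names are hypothetical; the default cut-off of 0.5 is the value discussed in the text, and breaking ties in favour of the first neuron is an assumption.

```python
def decode_binary(output, cutoff=0.5):
    """Single categorical neuron: 1 at or above the cut-off, otherwise 0."""
    return 1 if output >= cutoff else 0

def decode_winner(outputs):
    """Distributed output: the largest neuron becomes 1, all others 0."""
    winner = max(range(len(outputs)), key=lambda i: outputs[i])  # first max wins ties
    return [1 if i == winner else 0 for i in range(len(outputs))]

print(decode_binary(0.4))              # nearer to 0, so category 0 (male)
print(decode_winner([0.2, 0.3, 0.7]))  # the third neuron wins
```

Passing a different `cutoff` is one way to act on the advice about checking the outputs on the training data first.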
How many neurons
There is no definite rule, but there is a "rule of thumb" about the relationship between
the number of weights and the number of training values needed to give a certain level of
performance:
#training set > #weights / error proportion
e.g. if we want 10% as the maximum error and design a network with 20 weights:
20/0.1 = 200. We need more than 200 data points in the training set.
Alternatively, if you have a network with 15 weights and only 50 data points:
error = #weights / #training set = 15/50 = 0.3 ~ 1/3. So the network won't be very good.
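The rule of thumb is simple arithmetic and can be checked with a few lines of Python. The function names are hypothetical, and counting only connection weights (with biases as an optional extra) is an assumption, since the notes do not say whether biases count as weights.

```python
def count_weights(n_in, n_hidden, n_out, include_biases=False):
    """Number of connection weights in a three-layer feed forward network."""
    weights = n_in * n_hidden + n_hidden * n_out
    if include_biases:
        weights += n_hidden + n_out   # one bias per hidden and output neuron
    return weights

def min_training_size(n_weights, error_proportion):
    """Rule of thumb: the training set should exceed this many points."""
    return n_weights / error_proportion

def expected_error(n_weights, n_training):
    """Rule of thumb rearranged: rough error proportion for a given data size."""
    return n_weights / n_training

print(min_training_size(20, 0.1))   # the 20-weight, 10% error example
print(expected_error(15, 50))       # the 15-weight, 50-point example
print(count_weights(4, 3, 1))       # a 4-3-1 network, biases not counted
```

A 4-3-1 network like the one suggested for the iris data has 15 connection weights when biases are left out, so by this rule it would want well over 50 training points.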