Intro to Artificial Intelligence
Lecture 7
Neural Networks
Introduction to Neural Networks
Learning outcomes
After this lecture you will:

- be able to describe the biological model on which artificial neural nets are based;
- be able to model a perceptron and know the limitations of this model;
- be able to describe the perceptron learning rule;
- know the significant difference between single-layer and multi-layer perceptrons.
Background information – the following information and definitions will allow you to
appreciate more fully what the particular neural nets that we study are doing. We start
with the simplest type of artificial neuron and discover its limitations.
Real world problem 1
Jim wants a loan. He goes to his local bank and asks for one. Is he a good risk?
Well... the bank has lots of customers and lots of data on those customers. Can the bank
use this to decide about Jim?
Exercise: What kind of data does the bank have? What kind of output might we want?
Answer: Output might be simple yes/no or more sophisticated (no more than £3000) or
even £x at y% interest.
Real World Problem 2
We are a large supermarket chain. We wish to have a computer system that recognises the
registration number on a lorry so that it can be automatically assigned to a bay at our
warehouse. The number is a combination of letters and digits. Light is poor and it could,
of course, be raining.
How could we do this? What might be the inputs? The outputs?
Intro to Artificial Intelligence
Both of the problems have the following features:
- They are actual problems!
- The data is often noisy and unreliable.
- They are difficult problems. They can be tackled using conventional
  statistical/mathematical techniques, but these methods often underperform.
- We have many inputs with one or more outputs.
Artificial Neural Networks (ANNs) can tackle these types of problems.
Artificial Neural Networks (ANNs) are biologically inspired computer systems, modelled
on networks of biological neurons.
The human brain has approximately 10^11 neurons with of the order of 10^15 connections.
Biological neurons
Each neuron takes electrochemical signals as input and sends new signals via its
connections to other neurons.
When a cell fires, the signal travels down the axon.
Many other neurons receive this signal at synapses, where it passes into their dendrites.
The dendrites carry the input from many other neurons into the cell body, where the
input signals are summed.
On exceeding a threshold, a signal is sent via the axon to other cells.
The process repeats. The strengths of the connections at the synapses change as things
are learned.
One of the applications of neural nets is learning from data. We treat them here as an
engineering tool (though psychologists might use them in other ways).
The Perceptron
The simplest of artificial neurons is the perceptron – it fires if the input goes above some
threshold. This is often referred to as the simple perceptron or the “Rosenblatt
perceptron”.
[Figure: a perceptron with two inputs x1 and x2, weights w1 and w2 on the input edges,
and bias b, producing output = hardlim(w1x1 + w2x2 + b).]
NB: a perceptron can have more inputs.
Example calculations
We first of all sum the input to get S = w1x1+w2x2+b
[We view the weights on the input edges as behaving like the strengths of the synapses.]
Then we fire – but only if the input is big enough.
We control this by firing 0 if the summed input is below some threshold – a number T,
say – and firing 1 otherwise:
F(S) = 0 if S < T
F(S) = 1 if S >= T
We can define such an F via the hardlimit function:
hardlimit(x) = 0 if x < 0
hardlimit(x) = 1 otherwise.
Then F can be written as hardlimit(S - T).
In MATLAB the hardlimit function is called hardlim.
In general we can have more than one neural unit in a layer.
Set w1 = 0.3 and w2 = -0.2 and b = 0.5. This has fixed our perceptron. Now let us do
some computations with this perceptron:
Suppose x1 = 1 and x2 = 2
Suppose x1 = -1 and x2 = -2
Suppose x1 = -2 and x2 = 4
MATLAB makes the calculations easier by using vector (multi-valued) input, which allows
shorthand notation like output = hardlim(Wx + b). We can also have multi-valued output
by having several perceptrons accepting the same input and all calculating independently.
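The example calculations above can be checked with a short sketch. This is Python rather than MATLAB, with the hardlim function written out by hand; the weights and inputs are the ones fixed in the text.

```python
def hardlim(x):
    """Hard-limit activation: fires 1 when the input is >= 0, else 0."""
    return 1 if x >= 0 else 0

def perceptron(x1, x2, w1=0.3, w2=-0.2, b=0.5):
    """The perceptron from the text: output = hardlim(w1*x1 + w2*x2 + b)."""
    return hardlim(w1 * x1 + w2 * x2 + b)

print(perceptron(1, 2))    # S = 0.3 - 0.4 + 0.5 = 0.4  -> fires 1
print(perceptron(-1, -2))  # S = -0.3 + 0.4 + 0.5 = 0.6 -> fires 1
print(perceptron(-2, 4))   # S = -0.6 - 0.8 + 0.5 = -0.9 -> fires 0
```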
Training a perceptron
The idea is to train the perceptron to recognise some particular values of input [by firing
a 1] and reject [by firing a 0] other types of input. What we want to do is find the w and
b values which will make a perceptron which does what we want. We use supervised
learning and the adapt method with the perceptron learning rule.
We have input patterns x with desired output vector tx. The actual output of the network
when we put in x is net(x). Define the error vector e as
e = tx - net(x).
This is simply the difference between what we want and what we get when we push x
into the net. If the net produces the correct output we don't need to make any changes –
but if it is wrong we modify the weights and bias as follows:
new w = old w + e·xᵀ
new b = old b + e.
We cycle through each of the input vectors in turn, modifying the weights where
necessary, until the perceptron does what we want. We hope that the problem can be
solved – the perceptron convergence theorem guarantees that if a solution exists, this
procedure will find one in a finite number of steps.
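The learning rule above can be sketched as a training loop. This is a Python sketch, not MATLAB's adapt; the function name train_perceptron and the max_epochs safety cap are my own additions.

```python
def hardlim(x):
    """Hard-limit activation: fires 1 when the input is >= 0, else 0."""
    return 1 if x >= 0 else 0

def train_perceptron(samples, w, b, max_epochs=100):
    """Perceptron learning rule: for each pattern x with target t,
    compute e = t - net(x), then update w <- w + e*x and b <- b + e.
    Cycle through the patterns until an epoch produces no errors."""
    for _ in range(max_epochs):
        errors = 0
        for x, t in samples:
            s = sum(wi * xi for wi, xi in zip(w, x)) + b
            e = t - hardlim(s)
            if e != 0:
                w = [wi + e * xi for wi, xi in zip(w, x)]
                b = b + e
                errors += 1
        if errors == 0:
            break
    return w, b
```

For a linearly separable problem the loop stops once every pattern is classified correctly; for a non-separable one (such as XOR, below) it simply runs out of epochs.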
Example
Let's train a perceptron to discriminate between two points:
[1; 0] - output 1.
[0; 1] - output 0.
Let's create a perceptron with random w and b and train it using adapt.
Start with a random w and b – say w = [0.3 0.9] and b = [0.3].
If these values work then fine – if not we use the rules to change them.
What happens when we try input = [1; 0]?
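This first step can be checked by hand, or with a small Python sketch (hardlim written out rather than MATLAB's built-in):

```python
def hardlim(x):
    """Hard-limit activation: fires 1 when the input is >= 0, else 0."""
    return 1 if x >= 0 else 0

w = [0.3, 0.9]          # starting weights from the example
b = 0.3                  # starting bias
x, target = [1, 0], 1    # first training pattern and its desired output

s = w[0] * x[0] + w[1] * x[1] + b  # 0.3*1 + 0.9*0 + 0.3 = 0.6
output = hardlim(s)                # 0.6 >= 0, so the perceptron fires 1
e = target - output                # e = 0: correct, so w and b are unchanged
```

Since the error is zero for this pattern, the learning rule makes no change; training then moves on to the next input vector.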
What can a single perceptron learn?
Only linearly separable sets of data.
Example which a single perceptron cannot learn.
An example of this is the exclusive-or function.
x   y   out
0   0   0
1   0   1
0   1   1
1   1   0
Table 2.1: The Exclusive-or Problem
We cannot train a perceptron to reject [0; 0] and [1; 1] while accepting [1; 0] and
[0; 1].
[Figure: the four XOR points plotted in the (x, y) plane. The two out = 1 points and the
two out = 0 points lie on opposite diagonals, so no single straight line separates them.]
For a single neuron with a threshold activation function, two inputs and one output, it
can be shown that no choice of the two weights and bias will produce a solution.
Most real-world problems are not linearly separable.
If more layers are allowed, we can solve problems which aren't linearly separable.
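To see why extra layers help, here is a minimal Python sketch of a two-layer network of hard-limit units that computes exclusive-or. The particular weights are one hand-picked solution (not the result of training, and not from the lecture): one hidden unit computes OR, the other NAND, and the output unit ANDs them.

```python
def hardlim(x):
    """Hard-limit activation: fires 1 when the input is >= 0, else 0."""
    return 1 if x >= 0 else 0

def xor_net(x, y):
    """Two-layer perceptron for XOR: the hidden layer computes OR and NAND,
    and the output unit computes AND of the two hidden outputs."""
    h1 = hardlim(x + y - 0.5)      # OR(x, y)
    h2 = hardlim(-x - y + 1.5)     # NAND(x, y)
    return hardlim(h1 + h2 - 1.5)  # AND(h1, h2)

for x, y in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(x, y, xor_net(x, y))  # reproduces Table 2.1
```

Each individual unit still draws a straight line; combining the lines in a second layer carves out the region a single perceptron cannot.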
Portfolio Exercise
1. Complete the training of the perceptron which was not completed in the lecture.
2. MATLAB Demos. See the lab sheet on MATLAB demos and make sure that you work
   through them.