Artificial Neural Networks
March 31, 2016
Introduction to Artificial Intelligence
Lecture 16: Neural Network Paradigms I
Computers vs. Neural Networks
“Standard” Computers             Neural Networks
one CPU / few processing cores   highly parallel processing
fast processing units            slow processing units
reliable units                   unreliable units
static infrastructure            dynamic infrastructure
Why Artificial Neural Networks?
There are two basic reasons why we are interested in
building artificial neural networks (ANNs):
• Technical viewpoint: Some problems such as
character recognition or the prediction of future
states of a system require massively parallel and
adaptive processing.
• Biological viewpoint: ANNs can be used to
replicate and simulate components of the human
(or animal) brain, thereby giving us insight into
natural information processing.
Why Artificial Neural Networks?
Why do we need a paradigm other than symbolic AI
for building “intelligent” machines?
• Symbolic AI is well-suited for representing explicit
knowledge that can be appropriately formalized.
• However, learning in biological systems is mostly
implicit – it is an adaptation process based on
uncertain information and reasoning.
• ANNs are inherently parallel and work extremely
efficiently if implemented in parallel hardware.
How do NNs and ANNs work?
• The “building blocks” of neural networks are the
neurons.
• In technical systems, we also refer to them as units
or nodes.
• Basically, each neuron
– receives input from many other neurons,
– changes its internal state (activation) based on
the current input,
– sends one output signal to many other
neurons, possibly including its input neurons
(recurrent network)
How do NNs and ANNs work?
• Information is transmitted as a series of electric
impulses, so-called spikes.
• The frequency and phase of these spikes encode
the information.
• In biological systems, one neuron can be
connected to as many as 10,000 other neurons.
Structure of NNs (and some ANNs)
• In biological systems, neurons of similar
functionality are usually organized in separate
areas (or layers).
• Often, there is a hierarchy of interconnected layers
with the lowest layer receiving sensory input and
neurons in higher layers computing more complex
functions.
• For example, neurons in macaque visual cortex
have been identified that are activated only when
there is a face (monkey, human, or drawing) in the
macaque’s visual field.
[Figure: “Data flow diagram” of visual areas in the macaque brain. Blue: motion perception pathway; green: object recognition pathway.]
[Figure: Stimuli in the receptive field of a neuron.]
Structure of NNs (and some ANNs)
• In a hierarchy of neural areas such as the visual
system, those at the bottom (near the sensory
“input” neurons) only “see” local information.
• For example, each neuron in primary visual cortex
only receives input from a small area (about 1° in
diameter) of the visual field (called its receptive
field).
• As we move towards higher areas, the responses
of neurons become less and less location
dependent.
• In inferotemporal cortex, some neurons respond to
face stimuli shown at any position in the visual field.
Receptive Fields in Hierarchical Neural Networks

[Figure: Neuron A and the receptive field of A.]
Receptive Fields in Hierarchical Neural Networks

[Figure: Neuron B in the top layer and the receptive field of B in the input layer.]
How do NNs and ANNs work?
• NNs are able to learn by adapting their
connectivity patterns so that the organism
improves its behavior in terms of reaching certain
(evolutionary) goals.
• The strength of a connection, or whether it is
excitatory or inhibitory, depends on the state of a
receiving neuron’s synapses.
• The NN achieves learning by appropriately
adapting the states of its synapses.
An Artificial Neuron

[Figure: Neuron i receives inputs x_1, x_2, ..., x_n through synapses with weights w_{i,1}, w_{i,2}, ..., w_{i,n} and produces output x_i.]

net input signal:

net_i(t) = \sum_{j=1}^{n} w_{i,j}(t) \, x_j(t)

output:

x_i(t) = f_i(net_i(t))
The Net Input Signal
The net input signal is the sum of all inputs after
passing the synapses:

net_i(t) = \sum_{j=1}^{n} w_{i,j}(t) \, x_j(t)

This can be viewed as computing the inner product
of the vectors w_i and x:

net_i(t) = ||w_i(t)|| \cdot ||x(t)|| \cdot \cos \alpha,

where \alpha is the angle between the two vectors.
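To make the two views concrete, here is a minimal sketch in Python (the vectors are made-up example values, not from the lecture) that computes the net input both as a weighted sum over the synapses and via the norms and the angle between w_i and x:

```python
import numpy as np

# Hypothetical weight and input vectors for a single neuron i
w_i = np.array([0.5, -0.2, 0.8])
x = np.array([1.0, 0.3, 0.6])

# Net input as the weighted sum over all synapses
net_sum = np.dot(w_i, x)

# The same value via ||w_i|| * ||x|| * cos(alpha)
cos_alpha = np.dot(w_i, x) / (np.linalg.norm(w_i) * np.linalg.norm(x))
net_angle = np.linalg.norm(w_i) * np.linalg.norm(x) * cos_alpha

print(net_sum, net_angle)  # both print the same net input
```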
The Activation Function
One possible choice is a threshold function:
f_i(net_i(t)) = 1, if net_i(t) \geq \theta
              = 0, otherwise

The graph of this function looks like this:

[Figure: Step function f_i(net_i(t)) over net_i(t), equal to 0 below the threshold θ and jumping to 1 at net_i(t) = θ.]
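As a sketch (function and variable names are my own, not from the slides), a neuron with this threshold activation can be written as:

```python
import numpy as np

def threshold_neuron(w_i, x, theta):
    """Return 1 if the net input reaches the threshold theta, else 0."""
    net_i = np.dot(w_i, x)          # net input signal
    return 1 if net_i >= theta else 0

# Example: two equal weights and theta = 1.5 give an AND-like unit
print(threshold_neuron([1.0, 1.0], [1, 1], 1.5))  # 1
print(threshold_neuron([1.0, 1.0], [1, 0], 1.5))  # 0
```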
Binary Analogy: Threshold Logic Units
Example:

[Figure: TLU with inputs x_1, x_2, x_3, weights w_1 = 1, w_2 = 1, w_3 = -1, and threshold θ = 1.5, computing x_1 ∧ x_2 ∧ ¬x_3.]
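A quick enumeration (a sketch with my own helper names) confirms that this unit outputs 1 exactly for x_1 = 1, x_2 = 1, x_3 = 0:

```python
from itertools import product

def tlu(x, w, theta):
    """Threshold logic unit: 1 if the weighted input sum reaches theta."""
    return int(sum(wi * xi for wi, xi in zip(w, x)) >= theta)

# w = (1, 1, -1), theta = 1.5: realizes x1 AND x2 AND NOT x3
for x1, x2, x3 in product([0, 1], repeat=3):
    print(x1, x2, x3, "->", tlu((x1, x2, x3), (1, 1, -1), 1.5))
```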
Networks
Yet another example:

[Figure: A single TLU with inputs x_1, x_2 and unknown weights w_1, w_2 that is supposed to compute XOR, i.e., x_1 ⊕ x_2.]
Impossible! TLUs can only realize linearly separable
functions.
Linear Separability
A function f: {0, 1}^n → {0, 1} is linearly separable if the
space of input vectors yielding 1 can be separated
from those yielding 0 by a linear surface
(hyperplane) in n dimensions.

Examples (two dimensions):

[Figure: Two 2×2 grids of outputs over the binary inputs (x_1, x_2). In the left grid the 1-outputs can be separated from the 0-outputs by a straight line (linearly separable); in the right grid they cannot (linearly inseparable).]
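As a concrete instance (my own illustration, not from the slide), the two-input AND function is linearly separable: with w_1 = w_2 = 1 and θ = 1.5, the line x_1 + x_2 = 1.5 puts the single 1-output at (1, 1) on one side and the three 0-outputs on the other:

```latex
x_1 + x_2 \ge 1.5 \iff (x_1, x_2) = (1, 1) \quad \text{for } x_1, x_2 \in \{0, 1\}
```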
Linear Separability
To explain linear separability, let us consider the
function f: R^n → {0, 1} with

f(x_1, x_2, ..., x_n) = 1, if \sum_{i=1}^{n} w_i x_i \geq \theta
                      = 0, otherwise

where x_1, x_2, ..., x_n represent real numbers.
This will also be useful for understanding the
computations of artificial neural networks.
Linear Separability
f(x_1, x_2, ..., x_n) = 1, if \sum_{i=1}^{n} w_i x_i \geq \theta
                      = 0, otherwise

Input space in the two-dimensional case (n = 2):

[Figure: Three plots of the (x_1, x_2) input space, each divided into a 1-region and a 0-region by a straight line, for the parameter sets w_1 = 1, w_2 = 2, θ = 2; w_1 = -2, w_2 = 1, θ = 2; and w_1 = -2, w_2 = 1, θ = 1.]
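As a worked check of the first parameter set (my own illustration), the boundary between the 0-region and the 1-region is the line where the weighted sum equals the threshold:

```latex
x_1 + 2 x_2 = 2 \quad\Longrightarrow\quad x_2 = 1 - \frac{x_1}{2}
```

This is a line through (2, 0) and (0, 1); input vectors on or above it yield output 1.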
Linear Separability
So by varying the weights and the threshold, we can
realize any linear separation of the input space into
a region that yields output 1, and another region that
yields output 0.
As we have seen, a two-dimensional input space
can be divided by any straight line.
A three-dimensional input space can be divided by
any two-dimensional plane.
In general, an n-dimensional input space can be
divided by an (n-1)-dimensional plane or hyperplane.
Of course, for n > 3 this is hard to visualize.
Linear Separability
Of course, the same applies to our original function f
using binary input values.
The only difference is the restriction in the input
values.
Obviously, we cannot find a straight line to realize the
XOR function:
[Figure: XOR in the (x_1, x_2) plane: output 1 at (0, 1) and (1, 0), output 0 at (0, 0) and (1, 1); no straight line separates the 1s from the 0s.]
In order to realize XOR with TLUs, we need to
combine multiple TLUs into a network.
Multi-Layered XOR Network
[Figure: Two-layer TLU network computing x_1 ⊕ x_2. Two hidden units receive x_1 and x_2 with weights (1, -1) and (-1, 1), each with threshold 0.5; their outputs feed an output unit with weights (1, 1) and threshold 0.5.]
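A minimal sketch of this network in Python (helper names are my own) shows that the three threshold units together realize XOR:

```python
from itertools import product

def tlu(x, w, theta):
    """Threshold logic unit: 1 if the weighted input sum reaches theta."""
    return int(sum(wi * xi for wi, xi in zip(w, x)) >= theta)

def xor_network(x1, x2):
    h1 = tlu((x1, x2), (1, -1), 0.5)    # x1 AND NOT x2
    h2 = tlu((x1, x2), (-1, 1), 0.5)    # NOT x1 AND x2
    return tlu((h1, h2), (1, 1), 0.5)   # h1 OR h2

for x1, x2 in product([0, 1], repeat=2):
    print(x1, x2, "->", xor_network(x1, x2))   # 1 exactly when x1 != x2
```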
Capabilities of Threshold Neurons
What can threshold neurons do for us?
To keep things simple, let us consider such a neuron
with two inputs:
[Figure: Threshold neuron i with two inputs x_1 and x_2, weights w_{i,1} and w_{i,2}, and output x_i.]
The computation of this neuron can be described as
the inner product of the two-dimensional vectors x
and wi, followed by a threshold operation.
Capabilities of Threshold Neurons
Let us assume that the threshold θ = 0 and illustrate the
function computed by the neuron for sample vectors w_i and x:

[Figure: Vectors w_i and x plotted against the first and second vector components, with a dotted line through the origin perpendicular to w_i.]

Since the inner product is positive for -90° < α < 90°, in this
example the neuron’s output is 1 for any input vector x to the
right of or on the dotted line, and 0 for any other input vector.