Lecture 25
Artificial Intelligence: Recognition Tasks
(S&G, §13.4)

The Cognitive Inversion
• Computers can do some things very well that are difficult for people
  – e.g., arithmetic calculations
  – playing chess & other board games
  – doing proofs in formal logic & mathematics
  – handling large amounts of data precisely
• But computers are very bad at some things that are easy for people (and even some animals)
  – e.g., face recognition & general object recognition
  – autonomous locomotion
  – sensory-motor coordination
• Conclusion: brains work very differently from Von Neumann computers
The 100-Step Rule
• Typical recognition tasks take less than one second
• Neurons take several milliseconds to fire
• Therefore there can be at most about 100 sequential processing steps
  (e.g., roughly 0.5 s per task ÷ roughly 5 ms per firing ≈ 100 steps)

Two Different Approaches to Computing
• Von Neumann: Narrow but Deep
• Neural Computation: Shallow but Wide
How Wide?
• Retina: 100 million receptors
• Optic nerve: one million nerve fibers

A Very Brief Tour of Real Neurons (and Real Brains)
Left Hemisphere
(fig. from internet)

Grey Matter vs. White Matter

Typical Neuron
(fig. from Carter 1998)

Relative Cortical Areas

Neural Density in Cortex
• 148 000 neurons / sq. mm
• Hence, about 15 million / sq. cm
Information Representation in the Brain

Macaque Visual System
(fig. from Van Essen & al. 1992)

Hierarchy of Macaque Visual Areas
(fig. from Van Essen & al. 1992)

Bat Auditory Cortex
(figs. from Suga, 1985)

Real Neurons

Typical Neuron
Dendritic Trees of Some Neurons
A. inferior olivary nucleus
B. granule cell of cerebellar cortex
C. small cell of reticular formation
D. small gelatinosa cell of spinal trigeminal nucleus
E. ovoid cell, nucleus of tractus solitarius
F. large cell of reticular formation
G. spindle-shaped cell, substantia gelatinosa of spinal cord
H. large cell of spinal trigeminal nucleus
I. putamen of lenticular nucleus
J. double pyramidal cell, Ammon’s horn of hippocampal cortex
K. thalamic nucleus
L. globus pallidus of lenticular nucleus
(fig. from Truex & Carpenter, 1964)

Layers and Minicolumns
(fig. from Arbib 1995, p. 270)

Neural Networks in Visual System of Frog
(fig. from Arbib 1995, p. 1039)

Frequency Coding
(fig. from Anderson, Intr. Neur. Nets)
Variations in Spiking Behavior

Chemical Synapse
1. Action potential arrives at synapse
2. Ca ions enter cell
3. Vesicles move to membrane, release neurotransmitter
4. Transmitter crosses cleft, causes postsynaptic voltage change
(fig. from Anderson, Intr. Neur. Nets)
Neurons are Not Logic Gates
• Speed
  – electronic logic gates are very fast (nanoseconds)
  – neurons are comparatively slow (milliseconds)
• Precision
  – logic gates are highly reliable digital devices
  – neurons are imprecise analog devices
• Connections
  – logic gates have few inputs (usually 1 to 3)
  – many neurons have >100 000 inputs

Artificial Neural Networks

Typical Artificial Neuron
(diagram: inputs × connection weights → summation → net input → threshold → activation function → output)
Operation of Artificial Neuron
• Connection weights 2, –1, 4; threshold 2
• Input (1, 1, 0):
  1 × 2 = 2
  1 × –1 = –1
  0 × 4 = 0
  Σ = 1; since 1 < 2, the output is 0

Operation of Artificial Neuron
• Input (0, 1, 1):
  0 × 2 = 0
  1 × –1 = –1
  1 × 4 = 4
  Σ = 3; since 3 > 2, the output is 1
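A minimal sketch of this threshold unit in Python (the language and function names are my choice; the slides give only the figures). The weights (2, –1, 4) and the threshold 2 are taken from the example above.

# Threshold unit from the example above: weighted sum compared to a threshold.
def threshold_neuron(inputs, weights, threshold=2):
    net = sum(x * w for x, w in zip(inputs, weights))  # summation: net input
    return 1 if net > threshold else 0                 # step activation function

weights = [2, -1, 4]
print(threshold_neuron([1, 1, 0], weights))  # net = 1, 1 < 2, so output 0
print(threshold_neuron([0, 1, 1], weights))  # net = 3, 3 > 2, so output 1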
Feedforward Network
(diagram: an input layer, several hidden layers, and an output layer, each layer feeding the next)

Feedforward Network In Operation (1)–(5)
(diagrams: the input is applied to the input layer and activation is fed forward through the hidden layers, one layer per step)
Feedforward Network In Operation (6)–(8)
(diagrams: activation continues forward through the remaining hidden layers until the output layer produces the network's output)
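The slides show the forward pass only pictorially. Below is a minimal Python sketch of the same idea, assuming sigmoid units, small random weights, and no bias/threshold terms (all assumptions on my part; the figures do not specify them).

import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(layers, inputs):
    # Propagate an input vector through one weight matrix per layer:
    # input layer -> hidden layers -> output layer.
    activations = inputs
    for weights in layers:                 # weights: one row per neuron in the layer
        activations = [sigmoid(sum(w * a for w, a in zip(row, activations)))
                       for row in weights]
    return activations

# Example: 3 inputs -> 4 hidden -> 2 hidden -> 1 output, random weights.
random.seed(0)
sizes = [3, 4, 2, 1]
layers = [[[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
          for n_in, n_out in zip(sizes, sizes[1:])]
print(forward(layers, [1.0, 0.0, 1.0]))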
Comparison with Non-Neural Net Approaches
• Non-NN approaches typically decide output from a small number of dominant factors
• NNs typically look at a large number of factors, each of which weakly influences output
• NNs permit:
  – subtle discriminations
  – holistic judgments
  – context sensitivity
Connectionist Architectures
• The knowledge is implicit in the connection weights between the neurons
• Items of knowledge are not stored in dedicated memory locations, as in a Von Neumann computer
• “Holographic” knowledge representation:
  – each knowledge item is distributed over many connections
  – each connection encodes many knowledge items
• Memory & processing are robust in the face of damage, errors, inaccuracy, noise, …

Differences from Digital Calculation
• Information represented in continuous images (rather than language-like structures)
• Information processing by continuous image processing (rather than explicit rules applied in individual steps)
• Indefiniteness is inevitable (rather than definiteness assumed)
Supervised Learning
• Produce desired outputs for training inputs
• Generalize reasonably & appropriately to other inputs
• Good example: pattern recognition
• Neural nets are trained rather than programmed
  – another difference from Von Neumann computation

Learning for Output Neuron (1)
• Connection weights 2, –1, 4; threshold 2
• Input (0, 1, 1): Σ = 3; 3 > 2, so the output is 1
• Desired output: 1 → No change!

Learning for Output Neuron (2)
• Input (1, 1, 0): Σ = 1; 1 < 2, so the output is 0
• Desired output: 0 → No change!

Learning for Output Neuron (3)
• Input (0, 1, 1): the output is 1, but the desired output is 0 → output is too large
• Decrease the weights on the active inputs: –1 → –1.5 and 4 → 3.25
• Now Σ = 1.75; 1.75 < 2, so the output is 0

Learning for Output Neuron (4)
• Input (1, 1, 0): the output is 0, but the desired output is 1 → output is too small
• Increase the weights on the active inputs: 2 → 2.75 and –1 → –0.5
• Now Σ = 2.25; 2.25 > 2, so the output is 1
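The rule illustrated above (raise the weights on the active inputs when the output is too small, lower them when it is too large, leave them alone when the output is correct) can be written as a perceptron-style update. The sketch below is a Python rendering of that idea; the uniform step size eta is my assumption, so the exact numbers differ from the hand-adjusted weights on the slides.

# Perceptron-style learning for a single threshold output neuron.
def output(inputs, weights, threshold=2):
    net = sum(x * w for x, w in zip(inputs, weights))
    return 1 if net > threshold else 0

def learn_step(inputs, desired, weights, eta=0.5):
    error = desired - output(inputs, weights)   # +1: too small, -1: too large, 0: correct
    return [w + eta * error * x for w, x in zip(weights, inputs)]

weights = [2, -1, 4]
weights = learn_step([0, 1, 1], desired=0, weights=weights)  # output too large
print(weights, output([0, 1, 1], weights))                   # weights lowered; output now 0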
Credit Assignment Problem
How do we adjust the weights of the hidden layers?
(diagram: a feedforward network with its input layer, hidden layers, and output layer, and the desired output shown at the output layer)
Back-Propagation: Forward Pass

Back-Propagation: Actual vs. Desired Output

Back-Propagation: Correct Output Neuron Weights

Back-Propagation: Correct Last Hidden Layer

Back-Propagation: Correct All Hidden Layers

Back-Propagation: Correct First Hidden Layer

(sequence of diagrams: the input is fed forward to produce the actual output, which is compared with the desired output; the weight corrections are then propagated backward from the output neuron through the hidden layers to the first hidden layer)
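The slides present back-propagation only as a sequence of pictures. Below is a minimal Python sketch of one conventional formulation, assuming sigmoid units, a squared-error measure, a single hidden layer, and no bias terms (all assumptions on my part; the figures do not fix these details).

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def backprop_step(x, target, W1, W2, eta=0.5):
    # Forward pass: input -> hidden layer -> output layer.
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y = [sigmoid(sum(w * hi for w, hi in zip(row, h))) for row in W2]
    # Compare actual with desired output: deltas for the output neurons.
    d_out = [(t - yi) * yi * (1 - yi) for t, yi in zip(target, y)]
    # Propagate the corrections backward: deltas for the hidden neurons.
    d_hid = [hi * (1 - hi) * sum(d_out[k] * W2[k][j] for k in range(len(W2)))
             for j, hi in enumerate(h)]
    # Correct the weights: output layer first, then the hidden layer.
    for k, row in enumerate(W2):
        for j in range(len(row)):
            row[j] += eta * d_out[k] * h[j]
    for j, row in enumerate(W1):
        for i in range(len(row)):
            row[i] += eta * d_hid[j] * x[i]
    return y   # the actual output computed on the forward pass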
Use of Back-Propagation (BP)
• Typically the weights are changed slowly
• Typically the net will not give correct outputs for all training inputs after one adjustment
• Each input/output pair is used repeatedly for training
• BP may be slow
• But there are many better ANN learning algorithms
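As a usage sketch of the points above, the loop below presents a small set of training pairs repeatedly, making a small weight change on each presentation. It reuses the backprop_step function from the previous sketch (both snippets would need to be in the same file); the OR-like training set and the network sizes are arbitrary choices for illustration.

import random
random.seed(1)

# Toy training set (an arbitrary choice): inputs and desired outputs.
train = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [1])]

W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]  # 2 inputs -> 3 hidden
W2 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(1)]  # 3 hidden -> 1 output

for epoch in range(5000):              # each input/output pair is used repeatedly
    err = 0.0
    for x, target in train:
        y = backprop_step(x, target, W1, W2, eta=0.5)   # small change per presentation
        err += sum((t - yi) ** 2 for t, yi in zip(target, y))
    if epoch % 1000 == 0:
        print(epoch, err)              # the squared error typically falls, but slowly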
ANN Training Procedures
• Supervised training: we show the net the output it should produce for each training input (e.g., BP)
• Reinforcement training: we tell the net if its output is right or wrong, but not what the correct output is
• Unsupervised training: the net attempts to find patterns in its environment without external guidance
Applications of ANNs
• “Neural nets are the second-best way of doing everything”
• If you really understand a problem, you can design a special purpose algorithm for it, which will beat a NN
• However, if you don’t understand your problem very well, you can generally train a NN to do it well enough