CS 100, Lecture 24
(S&G, § 13.4)
The Cognitive Inversion
• Computers can do some things very well that are difficult for people
– e.g., arithmetic calculations
– playing chess & other board games
– doing proofs in formal logic & mathematics
– handling large amounts of data precisely
• But computers are very bad at some things that are easy for people (and even some animals)
– e.g., face recognition & general object recognition
– autonomous locomotion
– sensory-motor coordination
• Conclusion: brains work very differently from Von Neumann computers
The 100-Step Rule
• Typical recognition tasks take less than one second
• Neurons take several milliseconds to fire
• Therefore there can be at most about 100 sequential processing steps
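– e.g., assuming roughly 10 ms per neuron firing (an illustrative figure): 1 second ÷ 10 ms per step ≈ 100 sequential steps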
Two Different Approaches to Computing
Von Neumann: Narrow but Deep
Neural Computation: Shallow but Wide
How Wide?
Retina: 100 million receptors
Optic nerve: one million nerve fibers
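(i.e., roughly a 100-to-1 convergence from receptors onto optic-nerve fibers)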
A Very Brief Tour of Real Neurons (and Real Brains)
Left Hemisphere
(fig. from internet)
Typical Neuron
Grey Matter vs. White Matter
(fig. from Carter 1998)
Neural Density in Cortex
• 148 000 neurons / sq. mm
• Hence, about 15 million / sq. cm
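– i.e., 148 000 / sq. mm × 100 sq. mm per sq. cm ≈ 14.8 million / sq. cm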
Relative Cortical Areas
Information Representation in the Brain
Macaque Visual System
(fig. from Van Essen & al. 1992)
Hierarchy of Macaque Visual Areas
(fig. from Van Essen & al. 1992)
Bat Auditory Cortex
(figs. from Suga, 1985)
Real Neurons
Typical Neuron
Dendritic Trees of Some Neurons
A. inferior olivary nucleus
B. granule cell of cerebellar cortex
C. small cell of reticular formation
D. small gelatinosa cell of spinal trigeminal nucleus
E. ovoid cell, nucleus of tractus solitarius
F. large cell of reticular formation
G. spindle-shaped cell, substantia gelatinosa of spinal cord
H. large cell of spinal trigeminal nucleus
I. putamen of lenticular nucleus
J. double pyramidal cell, Ammon’s horn of hippocampal cortex
K. thalamic nucleus
L. globus pallidus of lenticular nucleus
(fig. from Truex & Carpenter, 1964)
Layers and Minicolumns
(fig. from Arbib 1995, p. 270)
Neural Networks in Visual System of Frog
(fig. from Arbib 1995, p. 1039)
Frequency Coding
(fig. from Anderson, Intr. Neur. Nets)
Variations in Spiking Behavior
Chemical Synapse
1. Action potential arrives at synapse
2. Ca ions enter cell
3. Vesicles move to membrane, release neurotransmitter
4. Transmitter crosses cleft, causes postsynaptic voltage change
(fig. from Anderson, Intr. Neur. Nets)
Neurons are Not Logic Gates
• Speed
– electronic logic gates are very fast (nanoseconds)
– neurons are comparatively slow (milliseconds)
• Precision
– logic gates are highly reliable digital devices
– neurons are imprecise analog devices
• Connections
– logic gates have few inputs (usually 1 to 3)
– many neurons have >100 000 inputs
Artificial Neural Networks
Typical Artificial Neuron
(figure: inputs, connection weights, threshold, output)
(figure: summation of weighted inputs → net input → activation function)
Operation of Artificial Neuron
Inputs 0, 1, 1; weights 2, –1, 4; threshold 2:
0 × 2 = 0, 1 × (–1) = –1, 1 × 4 = 4
Σ = 3; 3 > 2, so output = 1
Operation of Artificial Neuron
Inputs 1, 1, 0; same weights 2, –1, 4 and threshold 2:
1 × 2 = 2, 1 × (–1) = –1, 0 × 4 = 0
Σ = 1; 1 < 2, so output = 0
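Below is a minimal Python sketch (not from the lecture) of the threshold neuron these two examples step through; the weights (2, –1, 4) and threshold (2) come from the slides, while the function and variable names are illustrative assumptions.

def threshold_neuron(inputs, weights, threshold):
    # Net input: the weighted sum of the inputs.
    net = sum(x * w for x, w in zip(inputs, weights))
    # Step activation: fire (output 1) only if the net input exceeds the threshold.
    return 1 if net > threshold else 0

# The two worked examples above (weights 2, -1, 4; threshold 2):
print(threshold_neuron([0, 1, 1], [2, -1, 4], 2))   # net = 3 > 2, so 1
print(threshold_neuron([1, 1, 0], [2, -1, 4], 2))   # net = 1 < 2, so 0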
Feedforward Network
(figure: input → input layer → hidden layers → output layer)
Feedforward Network In Operation (1)–(8)
(sequence of figures: activation flows from the input layer through the hidden layers to the output layer)
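The following sketch (an illustration, not the lecture's code) shows the layer-by-layer forward pass that the figure sequence animates, reusing the threshold neuron defined above; the layer sizes and weights are made-up assumptions.

def neuron(inputs, weights, threshold=2):
    net = sum(x * w for x, w in zip(inputs, weights))
    return 1 if net > threshold else 0

def feedforward(inputs, layers):
    # Each layer is a list of weight vectors, one per neuron;
    # the outputs of one layer are the inputs of the next.
    activation = inputs
    for layer in layers:
        activation = [neuron(activation, w) for w in layer]
    return activation

# Hypothetical 3-2-1 network; weights chosen only for illustration.
layers = [
    [[2, -1, 4], [1, 1, 1]],   # hidden layer: 2 neurons, 3 inputs each
    [[3, 1]],                  # output layer: 1 neuron, 2 inputs
]
print(feedforward([0, 1, 1], layers))   # [1]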
Comparison with Non-Neural Net Approaches
• Non-NN approaches typically decide output from a small number of dominant factors
• NNs typically look at a large number of factors, each of which weakly influences output
• NNs permit:
– subtle discriminations
– holistic judgments
– context sensitivity
Connectionist Architectures
• The knowledge is implicit in the connection weights between the neurons
• Items of knowledge are not stored in dedicated memory locations, as in a Von Neumann computer
• “Holographic” knowledge representation:
– each knowledge item is distributed over many connections
– each connection encodes many knowledge items
• Memory & processing are robust in the face of damage, errors, inaccuracy, noise, …
Differences from Digital Calculation
• Information represented in continuous images (rather than language-like structures)
• Information processing by continuous image processing (rather than explicit rules applied in individual steps)
• Indefiniteness is inevitable (rather than definiteness assumed)
Supervised Learning
• Produce desired outputs for training inputs
• Generalize reasonably & appropriately to other inputs
• Good example: pattern recognition
• Neural nets are trained rather than programmed
– another difference from Von Neumann computation
Learning for Output Neuron (1)
Inputs 0, 1, 1; weights 2, –1, 4; Σ = 3; 3 > 2, so output = 1
Desired output: 1
No change!
Learning for Output Neuron (2)
Inputs 0, 1, 1; desired output: 0, but the neuron outputs 1: output is too large
Weights on the active inputs are reduced: –1 → –1.5 and 4 → 3.25 (the weight 2 is unchanged, since its input is 0)
New net input: Σ = 0 + (–1.5) + 3.25 = 1.75; 1.75 < 2, so output = 0
Learning for Output Neuron (3)
Inputs 1, 1, 0; weights 2, –1, 4; Σ = 1; 1 < 2, so output = 0
Desired output: 0
No change!
Learning for Output Neuron (4)
Inputs 1, 1, 0; desired output: 1, but the neuron outputs 0: output is too small
Weights on the active inputs are increased: 2 → 2.75 and –1 → –0.5 (the weight 4 is unchanged, since its input is 0)
New net input: Σ = 2.75 + (–0.5) + 0 = 2.25; 2.25 > 2, so output = 1
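A minimal sketch of the update rule that panels (1)–(4) illustrate: if the output already matches the desired output, nothing changes; if it is too small, the weights on the active inputs are raised; if it is too large, they are lowered. The uniform learning rate is an assumption (the slides nudge each weight by slightly different amounts), and the function name is illustrative.

def train_step(inputs, weights, threshold, desired, rate=0.5):
    net = sum(x * w for x, w in zip(inputs, weights))
    output = 1 if net > threshold else 0
    if output == desired:
        return weights                    # correct output: no change
    # Output too small -> raise weights on active inputs; too large -> lower them.
    sign = 1 if desired > output else -1
    return [w + sign * rate * x for x, w in zip(inputs, weights)]

# Case (2) above: desired 0 but the neuron fires, so the weights
# on the active inputs are decreased.
print(train_step([0, 1, 1], [2, -1, 4], 2, desired=0))   # [2, -1.5, 3.5]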
Credit Assignment Problem
How do we adjust the weights of the hidden layers?
(figure: input layer, hidden layers, output layer, desired output)
Back-Propagation (sequence of figures, from input to desired output):
• Forward Pass
• Actual vs. Desired Output
• Correct Output Neuron Weights
• Correct Last Hidden Layer
• Correct All Hidden Layers
• Correct First Hidden Layer
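The sketch below (illustrative Python, not the lecture's own code) runs through the back-propagation steps listed above for a tiny one-hidden-layer network; it uses a smooth sigmoid activation rather than the hard threshold above, because back-propagation needs differentiable neurons. The network size, learning rate, and XOR training set are assumptions made only for the example.

import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

random.seed(0)
# 2 inputs -> 2 hidden neurons -> 1 output neuron; the third weight of
# each neuron is a bias term.
W_hid = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
W_out = [random.uniform(-1, 1) for _ in range(3)]

def forward(x):
    # Forward pass: compute hidden activations, then the output.
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in W_hid]
    y = sigmoid(W_out[0] * h[0] + W_out[1] * h[1] + W_out[2])
    return h, y

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]   # XOR (illustrative)
rate = 0.5

for epoch in range(10000):            # each input/output pair is reused many times
    for x, desired in data:
        h, y = forward(x)
        # Actual vs. desired output, scaled by the sigmoid's slope.
        delta_out = (desired - y) * y * (1 - y)
        # Propagate the error back to the hidden neurons.
        delta_hid = [delta_out * W_out[i] * h[i] * (1 - h[i]) for i in range(2)]
        # Correct output-neuron weights, then hidden-layer weights, in small steps.
        for i in range(2):
            W_out[i] += rate * delta_out * h[i]
        W_out[2] += rate * delta_out
        for i in range(2):
            W_hid[i][0] += rate * delta_hid[i] * x[0]
            W_hid[i][1] += rate * delta_hid[i] * x[1]
            W_hid[i][2] += rate * delta_hid[i]

print([round(forward(x)[1], 2) for x, _ in data])   # should move toward [0, 1, 1, 0]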
Use of Back-Propagation (BP)
• Typically the weights are changed slowly
• Typically the net will not give correct outputs for all training inputs after one adjustment
• Each input/output pair is used repeatedly for training
• BP may be slow
• But there are many better ANN learning algorithms
ANN Training Procedures
• Supervised training: we show the net the output it should produce for each training input (e.g., BP)
• Reinforcement training: we tell the net if its output is right or wrong, but not what the correct output is
• Unsupervised training: the net attempts to find patterns in its environment without external guidance
Applications of ANNs
• “Neural nets are the second-best way of doing everything”
• If you really understand a problem, you can design a special-purpose algorithm for it, which will beat an NN
• However, if you don’t understand your problem very well, you can generally train an NN to do it well enough