Lecture 25
Artificial Intelligence: Recognition Tasks (S&G, §13.4)
The Cognitive Inversion
• Computers can do some things very well that are difficult for people
  – e.g., arithmetic calculations
  – playing chess & other board games
  – doing proofs in formal logic & mathematics
  – handling large amounts of data precisely
• But computers are very bad at some things that are easy for people (and even some animals)
  – e.g., face recognition & general object recognition
  – autonomous locomotion
  – sensory-motor coordination
• Conclusion: brains work very differently from Von Neumann computers
The 100-Step Rule
• Typical recognition tasks take less than one second
• Neurons take several milliseconds to fire
• Therefore there can be at most about 100 sequential processing steps (roughly 1 second ÷ ~10 ms per firing ≈ 100 steps)
Two Different Approaches to Computing
Von Neumann: Narrow but Deep
Neural Computation: Shallow but Wide
How Wide?
Retina: 100 million receptors
Optic nerve: one million nerve fibers
A Very Brief Tour of Real Neurons (and Real Brains)
Left Hemisphere
Typical Neuron
Grey Matter vs. White Matter
(fig. from Carter 1998)
Neural Density in Cortex
• 148,000 neurons / sq. mm
• Hence, about 15 million / sq. cm
Relative Cortical Areas
Information Representation in the Brain
Macaque Visual System
(fig. from Van Essen & al. 1992)
Hierarchy of Macaque Visual Areas
(fig. from Van Essen & al. 1992)
Bat Auditory Cortex
(figs. from Suga, 1985)
Real Neurons
Typical Neuron
Dendritic Trees of Some Neurons
A. inferior olivary nucleus
B. granule cell of cerebellar cortex
C. small cell of reticular formation
D. small gelatinosa cell of spinal trigeminal nucleus
E. ovoid cell, nucleus of tractus solitarius
F. large cell of reticular formation
G. spindle-shaped cell, substantia gelatinosa of spinal cord
H. large cell of spinal trigeminal nucleus
I. putamen of lenticular nucleus
J. double pyramidal cell, Ammon’s horn of hippocampal cortex
K. thalamic nucleus
L. globus pallidus of lenticular nucleus
(fig. from Truex & Carpenter, 1964)
Layers and Minicolumns
(fig. from Arbib 1995, p. 270)
Neural Networks in Visual System of Frog
(fig. from Arbib 1995, p. 1039)
Frequency Coding
(fig. from Anderson, Intr. Neur. Nets)
Variations in Spiking Behavior
Chemical Synapse
1. Action potential arrives at synapse
2. Ca ions enter cell
3. Vesicles move to membrane, release neurotransmitter
4. Transmitter crosses cleft, causes postsynaptic voltage change
(fig. from Anderson, Intr. Neur. Nets)
Neurons are Not Logic Gates
• Speed
– electronic logic gates are very fast (nanoseconds)
– neurons are comparatively slow (milliseconds)
• Precision
– logic gates are highly reliable digital devices
– neurons are imprecise analog devices
• Connections
– logic gates have few inputs (usually 1 to 3)
– many neurons have >100 000 inputs
Artificial Neural Networks
Typical Artificial Neuron
[diagram: inputs, connection weights, threshold, output]
Typical Artificial Neuron
[diagram: net input, summation, activation function]
Operation of Artificial Neuron
02 = 0
1  1 = –1
14 = 4
0
2
1
–1
S
3
3>2
1
4
2
1
5/29/2016
CS 100 - Lecture 24
29
Operation of Artificial Neuron
12 = 2
1  1 = –1
04 = 0
1
2
1
–1
S
1
1<2
0
4
2
0
5/29/2016
CS 100 - Lecture 24
30
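The same computation as a minimal Python sketch (an illustration, not code from the lecture; the function name is made up). The weights (2, –1, 4) and threshold 2 are the ones used in the two examples above.

def threshold_neuron(inputs, weights, threshold):
    # Net input: the weighted sum of the inputs.
    net = sum(x * w for x, w in zip(inputs, weights))
    # Output 1 if the net input exceeds the threshold, else 0.
    return 1 if net > threshold else 0

# The two examples above: weights (2, -1, 4), threshold 2.
print(threshold_neuron([0, 1, 1], [2, -1, 4], 2))   # net = 3 > 2  ->  1
print(threshold_neuron([1, 1, 0], [2, -1, 4], 2))   # net = 1 < 2  ->  0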
Feedforward Network
[diagram: input layer → hidden layers → output layer]
Feedforward Network In Operation (1)–(8)
[figure sequence: an input is presented to the input layer and activity propagates forward, layer by layer, until the output layer produces the output]
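A hedged sketch of how such a forward pass might be coded; each unit is the same kind of threshold neuron as above, and the layer sizes, random weights, and function names here are made up purely for illustration.

import random

def layer_output(inputs, weight_matrix, threshold):
    # One layer: every unit computes a thresholded weighted sum of all its inputs.
    return [1 if sum(x * w for x, w in zip(inputs, ws)) > threshold else 0
            for ws in weight_matrix]

def feedforward(inputs, layers, threshold=0):
    # Activity flows strictly forward: each layer's output is the next layer's input.
    activity = inputs
    for weight_matrix in layers:
        activity = layer_output(activity, weight_matrix, threshold)
    return activity

# Illustrative network: 4 inputs, two hidden layers of 3 units each, 2 output units.
random.seed(0)
sizes = [4, 3, 3, 2]
layers = [[[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
          for n_in, n_out in zip(sizes, sizes[1:])]
print(feedforward([1, 0, 1, 1], layers))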
Comparison with Non-Neural Net Approaches
• Non-NN approaches typically decide output from a small number of dominant factors
• NNs typically look at a large number of factors, each of which weakly influences output
• NNs permit:
  – subtle discriminations
  – holistic judgments
  – context sensitivity
Connectionist Architectures
• The knowledge is implicit in the connection weights between the neurons
• Items of knowledge are not stored in dedicated memory locations, as in a Von Neumann computer
• “Holographic” knowledge representation:
  – each knowledge item is distributed over many connections
  – each connection encodes many knowledge items
• Memory & processing are robust in the face of damage, errors, inaccuracy, noise, …
Differences from Digital Calculation
• Information represented in continuous images (rather than language-like structures)
• Information processing by continuous image processing (rather than explicit rules applied in individual steps)
• Indefiniteness is inevitable (rather than definiteness assumed)
Supervised Learning
• Produce desired outputs for training inputs
• Generalize reasonably & appropriately to other inputs
• Good example: pattern recognition
• Neural nets are trained rather than programmed
  – another difference from Von Neumann computation
Learning for Output Neuron (1)
Inputs (0, 1, 1), weights (2, –1, 4), threshold 2: net input = 3, and 3 > 2, so output = 1
Desired output: 1
The output already matches the desired output: no change!
Learning for Output Neuron (2)
Same inputs (0, 1, 1), but desired output: 0. The actual output 1 is too large.
Decrease the weights on the active inputs: –1 becomes –1.5 and 4 becomes 3.25 (the weight 2 is unchanged, since its input is 0).
New net input = 0 – 1.5 + 3.25 = 1.75, and 1.75 < 2, so the output becomes 0, as desired.
Learning for Output Neuron (3)
Inputs (1, 1, 0), weights (2, –1, 4), threshold 2: net input = 1, and 1 < 2, so output = 0
Desired output: 0
The output already matches the desired output: no change!
Learning for Output Neuron (4)
Same inputs (1, 1, 0), but desired output: 1. The actual output 0 is too small.
Increase the weights on the active inputs: 2 becomes 2.75 and –1 becomes –0.5 (the weight 4 is unchanged, since its input is 0).
New net input = 2.75 – 0.5 + 0 = 2.25, and 2.25 > 2, so the output becomes 1, as desired.
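The adjustments in examples (1)–(4) follow a perceptron-style rule: do nothing when the output is correct, raise the weights on active inputs when the output is too small, and lower them when it is too large. A minimal sketch of that rule, assuming a fixed learning rate of 0.25; the slides' exact increments (e.g. –1 to –1.5) appear to be hand-picked and are not reproduced here.

def train_output_neuron(weights, threshold, inputs, desired, rate=0.25):
    # Perceptron-style update: a simple version of the adjustment shown
    # in examples (1)-(4).  The learning rate 0.25 is an arbitrary choice.
    net = sum(x * w for x, w in zip(inputs, weights))
    actual = 1 if net > threshold else 0
    error = desired - actual        # +1: output too small, -1: too large, 0: correct
    # Only weights on active (nonzero) inputs change; if the output is
    # already correct, error = 0 and nothing changes.
    return [w + rate * error * x for w, x in zip(weights, inputs)]

weights = [2, -1, 4]
print(train_output_neuron(weights, 2, [0, 1, 1], desired=1))  # correct: no change
print(train_output_neuron(weights, 2, [0, 1, 1], desired=0))  # too large: active weights decrease
print(train_output_neuron(weights, 2, [1, 1, 0], desired=1))  # too small: active weights increase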
Credit Assignment Problem
[diagram: input layer, hidden layers, output layer, desired output]
How do we adjust the weights of the hidden layers?
Back-Propagation
[figure sequence:]
(1) Forward Pass
(2) Actual vs. Desired Output
(3) Correct Output Neuron Weights
(4) Correct Last Hidden Layer
(5)–(7) Correct All Hidden Layers
(8) Correct First Hidden Layer
Use of Back-Propagation (BP)
• Typically the weights are changed slowly
• Typically the net will not give correct outputs for all training inputs after one adjustment
• Each input/output pair is used repeatedly for training
• BP may be slow
• But there are many better ANN learning algorithms
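A compact back-propagation sketch, not from the lecture: it substitutes a smooth sigmoid activation for the hard threshold used earlier (a hard threshold gives no usable error gradient), uses a single hidden layer, and trains on the XOR problem as illustrative data. As the bullets above say, the weights change a little on each pass and the same input/output pairs are presented many times.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Training data: the XOR problem (an illustrative choice, not from the lecture).
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])          # desired outputs

# One hidden layer of 8 units; weights start out random, biases at zero.
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
rate = 0.5                                      # small steps: weights change slowly

for _ in range(10000):                          # the same pairs are presented repeatedly
    # Forward pass.
    H = sigmoid(X @ W1 + b1)                    # hidden-layer activity
    Y = sigmoid(H @ W2 + b2)                    # actual output
    # Compare actual with desired output, and propagate the error backwards.
    d_out = (Y - T) * Y * (1 - Y)               # error signal at the output layer
    d_hid = (d_out @ W2.T) * H * (1 - H)        # error signal at the hidden layer
    # Correct the weights: output layer first, then the hidden layer.
    W2 -= rate * H.T @ d_out;  b2 -= rate * d_out.sum(axis=0)
    W1 -= rate * X.T @ d_hid;  b1 -= rate * d_hid.sum(axis=0)

print(Y.round(2))                               # usually close to the desired 0, 1, 1, 0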
ANN Training Procedures
• Supervised training: we show the net the output it should produce for each training input (e.g., BP)
• Reinforcement training: we tell the net if its output is right or wrong, but not what the correct output is
• Unsupervised training: the net attempts to find patterns in its environment without external guidance
Applications of ANNs
• “Neural nets are the second-best way of doing everything”
• If you really understand a problem, you can design a special purpose algorithm for it, which will beat a NN
• However, if you don’t understand your problem very well, you can generally train a NN to do it well enough